Images and Thinking
Critique of arguments against images as a medium of thought
The Way of Ideas died an ignoble death, committed to the flames by behaviorist empiricists. Ideas, pictures in the head, perished with the Way. By the time those empiricists were supplanted at the helm by functionalists and causal theorists, a revolution had taken place in linguistics and the last thing anyone wanted to do was revive images as the medium of thought. Currently, some but not all cognitive scientists think that there probably are mental images - experiments in cognitive psychology (e.g. Shepard and Metzler 1971) have shown it to be plausible to posit mental images. Even so, the phenomenon of mental imagery has been largely regarded as peripheral in cognition, perhaps even epiphenomenal. Images cannot fix the content of thought (intentions, rules), the Wittgenstein story went. The central processes of thought, so the post-Wittgenstein story goes, require a propositional representation system, a language of thought, universal and modeled on the machine languages of computers. The language of thought is compositional, productive, and, leading advocates argue, has a causal semantics. Images lack all of these essential qualities and so are hopeless as key players in thinking.
Throughout the history of thinking about the role of images in thought, the focus has been almost entirely on visual imagery. Even in recent work in cognitive psychology, those who prominently have been interested in images (e.g. Shepard and Metzler, Steven Kosslyn, Steven Pinker) have focussed almost exclusively on visual images [notable exceptions - Reisberg, Gathercole, Baddeley.]. However, visual images are not the only form of mental image, and may not be the most important type for human cognition. In addition, focussing on visual images and their peculiar pictorial and perspectival properties leads to denying a proper place for images in thinking. So I shall argue.
Most of what follows attempts to reply to what appear to be the main arguments against the view that images play a central role in thought. Thus I'll indirectly defend the possibility that mental images play a very important role in human cognition, including inference. The arguments I reply to come from Zenon Pylyshyn, Michael Tye, Jerry Fodor and especially Steven Pinker. Other arguments, from Gilbert Ryle, have, in my view, been adequately responded to by Tye 1991. I'll divide the arguments into four groups: psychological arguments (including regress and simplicity arguments), syntactic arguments (concerning the structural form mental representations must take to be suitable for thought), semantic arguments (concerning the semantic properties of images), and finally a Big Picture argument from the history of views about the role of images in philosophy and psychology.
I. Psychological arguments
1. Regress of image viewers argument. Mental images would require an inner homunculus to view them (this was Pylyshyn's first argument against mental images, as reported by S. Kosslyn 1994 p. 6).
So let us begin with a regress. I suppose that many now regard this as a poor objection. Clearly it appears to be based on a particular, narrow, view as to what an image must be -- it appears an image must have an appearance, in the form of a light emitting or reflecting surface. I myself am of two minds about the merits of the argument. On the one hand, a mental image is not optical, and so cannot literally be viewed. In that way, the objection, at least in this bald form, is misguided. Images can be latent, as on exposed but undeveloped photographic film, or they can exist in digital form in an electrical or magnetic medium. So there appears to be an established (and reasonable) current use of "image" in which some images at least need not be visible. But it is also true that an image, mental or otherwise, must be interpreted to play a mental role. An image in the brain that has no causal role in producing thought and behavior is of no more psychological interest than an optical image projected on the skull. Both are in the head, but neither is mental. Indeed, this thought, the thought that images require interpretation, especially propositional interpretation, is central to several objections to images as a medium of thought. These objections are independent of the regress argument, but share with it the basic intuition that thought must involve more than images.
A better model for mental imagery than a viewer looking at a photo comes from images in computers. This model has changed over the years - Tye quotes Ned Block as supposing computers are best suited to operating on sentential representations (Tye 1991 pp. 45-6). At the time Block was writing, it was generally true that computers either did number crunching or handled sentential input. Increasingly, though, computers process imagistic representations (originally, this was too costly - now inexpensive video game computers are very proficient image processors). This trend toward image processing will no doubt continue. These imagistic representations in computers take several forms. Some are bitmaps, where each quantum element of the image ("pixel") has a color and brightness value; others take less data-intensive forms, such as vector graphics and compressed images. [See Berkeley on "minimum sensible", e.g. Principles sec. 132 for early recognition of limits to visual resolution, quanta as elements of an image, as opposed to the mathematical continuum.] Commonly, computers are involved in image generation rather than consumption, as in the game systems where images, and lots of them, are the output. But when a computer is connected to a video input device, the system may analyze the image - as with face recognition (developed for security applications), and more importantly with video camera sensors -- "eyes" -- for robots. In these cases, the images are inputs to a causal computational system that controls (non-imaging) behavior -- that is, the images are not just used to produce images as output, as in an image enhancement or editing system, but to control other forms of behavior. In these cases the internal images are, as Tye puts it, functional images.
They have causal properties (as when an inner image derived from a camera causes a robot to turn to avoid a visible obstruction, or a computer image of a face leads to output of a sentence naming the owner of the face) which are the same causal properties that a visible image could have when viewed by a human. But no viewing homunculus is required. Nevertheless, the causal process is, at least in part, the functional equivalent of a homunculus - and so, in that non-damaging way, there is a core of truth to the argument.
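The robot case can be made concrete with a minimal sketch. Everything in it is an illustrative assumption rather than anything drawn from Tye or from a real robotics system: the bitmap is a small grid of brightness values, `obstruction_mass` and `steer` are hypothetical helpers, and the threshold for what counts as an obstruction is arbitrary. The point is only that the array is never displayed or looked at; it plays its role by causing a non-imaging output.

```python
# A toy "functional image": a bitmap that controls behavior without being viewed.
# All names and values here are hypothetical, chosen only for illustration.

Bitmap = list[list[int]]  # rows of brightness values, 0 (dark) to 255 (bright)


def obstruction_mass(image: Bitmap, columns: range, dark_below: int = 64) -> int:
    """Count dark pixels (assumed to indicate an obstacle) in the given columns."""
    return sum(1 for row in image for c in columns if row[c] < dark_below)


def steer(image: Bitmap) -> str:
    """Map the image to a motor command by comparing its left and right halves."""
    width = len(image[0])
    left = obstruction_mass(image, range(0, width // 2))
    right = obstruction_mass(image, range(width // 2, width))
    if left == 0 and right == 0:
        return "forward"
    # Turn away from the side with more dark pixels.
    return "turn_right" if left >= right else "turn_left"


# A 3x4 frame with a dark region in the lower left.
camera_frame: Bitmap = [
    [200, 200, 200, 200],
    [10, 20, 200, 200],
    [10, 10, 200, 200],
]
command = steer(camera_frame)
```

Here the whole causal process from pixel array to motor command does the work a viewing homunculus was supposed to do, which is the non-damaging core of truth in the regress argument.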
An image in a computer is similar to a latent image on photo film in that it is a non-propositional store of information. Just as the exposed film contains an array of discrete bits of information (in the form of chemical states in crystals of photosensitive material), so the computer contains a (logical rather than physical) array of discrete bits of information as electrical states of a silicon chip, or magnetic domains on a disk surface (the exact physical form will change as the state-of-the-art of computer long and short term storage evolves). But the latent image on film just sits there, inaccessible, until developed. The image in a computer is available for processing and analysis - and control, based on content, of further processes. In this respect it is more like a similarly invisible (nonoptical) image in the brain than it is like the latent photo image. No homuncular viewer is required because the brain image is in a viewer, a part of the causal process that constitutes the viewer.
(See Tye for a discussion of related psychological arguments from Ryle 1949, Concept of Mind.)
2. The Argument from Simplicity - we must posit propositional representations, and it is simpler psychology to posit only a single representation system (Kosslyn 1984 p. 8). If one can account for behavior on the basis of a single, propositional, representation system, one should not posit additional systems.
This preference for simplicity is certainly reasonable methodology in general. The problems with it in the present case are twofold: first, one can't account for the psychological facts, including subjects' reports or their experience, without seriously ad hoc explanations of the evidence. Second, the system we are dealing with is quite possibly a biological kludge - layers of neural systems to deal with particular info processing chores. The behavioral evidence is rife with revealing lapses from rationality, competence, efficiency. Particularly revealing to my mind are our notorious deficiencies at mathematics. Arithmetic is the one area where even simple digital electronic devices excel. An efficient representation system makes arithmetic calculations trivial. Yet humans struggle with multi-digit multiplication, and collapse under the slight demands of mental division. So: possibly one could build a system with a unified representational code; but there is no reason to believe we, as real live evolved systems, have a central nervous system governed by considerations of elegance and simplicity.
It is worth noting that in How the Mind Works, Steven Pinker presents considerations that count against viewing mental representations as all in a monolithic code. Pinker points out that the empirical evidence supports the view that there are at least four distinct forms of mental representation: visual images, phonological images, grammatical representations, and mentalese (89-90). In the section that follows, titled "Why so many kinds of representations?", Pinker argues that it is generally more efficient to have modular organization (in brains as well as in good computer programming practice), and that modules serving different purposes will be best served by differing representation systems. Different representational systems lend themselves better to some tasks than to others - one needs the right data representation for the job at hand. True enough, but again, it is worth bearing in mind that efficiency may not be the ruling principle in a layered evolved system. Subsystems originally serving one purpose may be pressed into new roles, and efficiency may yield to the mere fact that it is possible to perform the roles at all.
Presumably there may be many other kinds of mental representation beyond the four Pinker lists - there may be spatial maps, including body image, that are not visual. There appear to be representations of sequences of motor activity. These are splendidly manifest in playing musical instruments, dancing, and sub-vocalization. So it seems reasonable to hold that considerations of efficiency may count against a unified code, and further, that considerations of efficiency may not be paramount in a complex evolved system such as the brain (designed by a mindless committee, as it were, working over eons).
II. Syntactic arguments:
3. Compositionality (having a constituent structure) is essential to thought -- and compositionality requires propositional representations. Images are not capable of compositionality. (See Fodor 1987, Psychosemantics: "What makes the story a Language of Thought story … is the idea that these mental states that have content also have syntactic structure -- constituent structure in particular -- that's appropriate to the content that they have.") That appropriateness is cashed out as combinatorial semantics - the meaning of the whole is a function of the meaning of the parts. The implication is that this consideration counts in favor of a propositional representational system, and against rivals such as images. (On the difference between propositional and imagistic representations, see Kosslyn 1994 Image and Brain pp. 4-5, Pinker pp. 290-293, and Tye 1991.)
Reply: Two points: First, images can be compositional. The cat can be depicted worrying the rat that eats the malt that is in the house that Jack is building. One can depict a circle in a square, and then that square in a larger rectangle. If one can depict Pa Kettle standing to the right of Ma Kettle, and one can depict Ma Kettle to the right of a cow, one can combine them and depict Ma Kettle between Pa and a cow. If one depicts the circle in the square, and in the same image depicts that square as in a rectangle, then one depicts the circle as in the rectangle. The same holds for other transitive spatial relations (above, left of, under).
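The containment example can be sketched in code, under the loud assumption that a depiction is modeled as nothing more than a set of axis-aligned bounding boxes (the circle is stood in for by its bounding box). The names `Box` and `inside` and all the coordinates are hypothetical. No fact of the form "the circle is in the rectangle" is stored anywhere; it is read off the geometry of the composed image.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Hypothetical stand-in for a depicted figure: its bounding box."""
    left: float
    bottom: float
    right: float
    top: float


def inside(inner: Box, outer: Box) -> bool:
    """True when `inner` lies strictly within `outer` on every side."""
    return (outer.left < inner.left and inner.right < outer.right
            and outer.bottom < inner.bottom and inner.top < outer.top)


# One composed scene: only positions are given, no relations are asserted.
scene = {
    "circle": Box(4, 4, 6, 6),
    "square": Box(3, 3, 7, 7),
    "rectangle": Box(0, 1, 12, 9),
}

circle_in_square = inside(scene["circle"], scene["square"])
square_in_rectangle = inside(scene["square"], scene["rectangle"])
# Never drawn as a separate fact, yet depicted by the same image:
circle_in_rectangle = inside(scene["circle"], scene["rectangle"])
```

Depicting the first two relations in a single scene fixes the third, which is the sense in which the whole image's content is a function of its depicted parts and their arrangement.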
Second, and more importantly, there appears to be presupposed what I shall argue is a false dichotomy between propositional representation systems (mentalese) and imagistic representation systems. I'll discuss this more fully in responding to argument 6, the argument that representations must be capable of being true or false.
4. Productivity and systematicity are essential to thought (human thought, that is) -- and images lack these features. These features set off Language of Thought theories from other forms of intentional realism (that is, theories that place meaning bearers in the head, as opposed to adverbial and eliminativist theories). The terminology is from Fodor. Productivity is the potential infinity of representations. Our "ability to understand and produce sentences…is -- as I shall say -- systematic: by which I mean that the ability to produce/understand some of the sentences is intrinsically connected to the ability to produce/understand many of the others."
Reply: Just as they can display compositionality, images can display forms of productivity and systematicity. One can depict a circle in a square, and then that square in a larger rectangle, and go on to depict the rectangle inside an ellipse, and so forth, indefinitely, limited only by performance capacity. Visual images display systematicity when one substitutes one figure for another in an image: typically if you can depict Curly hitting Moe, and you can depict Shemp, then you can depict Shemp hitting Moe, and also Moe hitting Curly.
If you can depict the Parthenon against a sunny sky, and you can depict a cloudy sky, you can depict the Parthenon against a cloudy sky. One doesn’t need to think literally of art as language to note some shared traits, in particular representational flexibility, combinatorial potential and articulateness.
[And again, images of sentences will inherit all these properties, compositionality, systematicity, productivity.]
III. Semantic arguments:
5. Resemblance is an inadequate basis for semantics. If thoughts were images, thought would require resemblance semantics.
Reply: The complaint about the inadequacy of resemblance is well-taken. But, history aside, there's no reason to take resemblance as the basis of semantics for images, mental or otherwise. Images can have the same causal, indicator semantics as do (pace Fodor) the non-imagistic constituents of mentalese. This is clear even with non-mental images - a photograph of Mark is a photo of him, even though it also resembles his twin brother. Its representational qualities come from its causal history. The same can be true of mental images. My mental image of Mark is of him because it was caused by him.
Of course, photos, good photos at least, tend to resemble, in the relevant respects, the objects that cause the photographic image. And this resemblance aids us in recognition - given a photo, we can often determine who it depicts solely on the basis of internal features of the image. And then again, often we cannot. In any case, it is crucial to note this is a recognitional, epistemic role that resemblance has -- not a semantic role. What something represents and what one takes it to represent are quite different things, the first semantic, the second epistemic. The semantic relation of representation is not constituted by resemblance in the case of photos, nor is it in the case of mental images.
Supposing that mental representations have a causal or indicator semantics is compatible with supposing that the representations are images. Thus imagistic models of mental representation are not saddled with an untenable resemblance semantics. (There are of course many other objections to resemblance as an account of representation - resemblance is symmetric, representation is not. Resemblance is vague, representation is not.)
Recognizing an indicator semantics for images creates pregnant possibilities: images can then represent states of affairs, including abstract, non visual, states of affairs - and not just objects. And then images with such a semantics could be true or false. Next argument!
6. Thoughts must be capable of being true or false. This is probably the most important consideration for driving a wedge between pictorial and propositional representations. The latter are semantically evaluable. The former, being neither true nor false, are hopeless for anything but a peripheral role in thought, for they are incapable of partaking in all the good things: truth-functional connectives, modality, tense.
Reply: The presupposition appears to be that images must represent by resemblance. Images can represent states of affairs - without resembling them. Consider icons. Often an illuminated oil-can icon on an automobile instrument cluster represents that the oil pressure is below safe operating limits. It does not represent an oil-can. The icon does the work of a warning in natural language. Its shape is mnemonic - an aid to recognition, not a determinant of semantics. Its semantics is paradigmatic of indicator semantics. As an artifact, it has a purpose and can misrepresent - if the oil pressure light goes on when oil pressure is normal, it gives a false indication.
While icons may represent states of affairs, they do not, or do not easily, admit of compositionality and systematicity. Fortunately, there are alternative ways in which images may be true or false.
The important additional way in which images can be true or false is by being sentential. E.g. you as reader are having a visual image right now -- of this sentence -- and quite possibly an accompanying auditory or proprioceptive image of this sentence, an image of how it would sound spoken. It is a false dichotomy to suppose that representations must be sentences or (exclusive) images. For images can be of cars, or dogs, cabbages or kings -- or of sentences in natural languages. Visual images can be of printed sentences. Auditory images can be of spoken language. (And proprioceptive imagery might be of speech -- or of writing, signing, or tapping out Morse code.)
This changes everything. In place of a false dichotomy of image (almost always visual, in the literature) opposed to mentalese sentence, with all the virtues necessary for thought going to the sentential, we can consider the possibility that imagistic representations can have all the requisites for thought: systematicity, truth values, and causal potential mirroring implicature.
Mental sentences can take the form of acoustic, visual, or proprioceptive images. A mental image of a sentence is a tokening of that sentence, just as is an inscription or an utterance. A tokening of an auditory image of a sentence, unlike a tokening of a mentalese sentence, will represent acoustic and phonological features of an utterance of the sentence, including all or some of pitch, rhyme, duration and phrasing. An auditory image will be an image of a sentence in a particular natural language. It will be a token of a natural language sentence inside the head. As such, imaging of sentences in natural language inherits important features of natural language: systematicity and truth values. And unlike sentences outside the head, imaged sentences, as states of a complex causal system, can have inference roles.
In light of this possibility of sentential images as a medium of thought, let us turn to Steven Pinker's arguments against supposing images are important for thinking. Pinker's interests are broad, his knowledge encyclopedic, and he attempts to develop a general account of cognition. Pinker's views are especially interesting because he has done empirical work on imaging, and recognizes that some thought is imagistic. However, he argues, both in The Language Instinct and in How the Mind Works, that the primary medium of thought must be a non-imagistic representation system, mentalese.
Pinker's arguments are, broadly speaking, semantic. They concern the adequacy of images as content bearers. Pinker argues that images are too impoverished semantically to be a medium of thought. Generally, Pinker discusses visual images, and almost always images of non-linguistic objects.
In one of the few places where he mentions phonological images, Pinker (How the Mind Works p.89) says that phonological images are "a stretch of syllables that we play in our minds like a tape loop, planning out the mouth movements and imagining what the syllables sound like. This string-like representation is an important component of our short-term memory…."
Pinker notes that these representations last only 1-5 seconds, with a capacity limit of 4-7 "chunks". This form of representation is contrasted with grammatical representations, and, most importantly, with mentalese - "the language of thought in which our conceptual knowledge is couched….Mentalese is the medium in which content or gist is captured." Mentalese is the mind's lingua franca - it carries information between modules, and, presumably, is suitable for representation in long term memory (since phonological memory is so short).
It seems clear how Pinker understands the general deficiencies of images: visual images, with their concrete objects, are not suitable for abstract thought. Phonological images are not of much interest -- they are too fleeting. (And they are too connected with speech, and so not relevant to thought in animals and prelinguistic children - not suitable as a general medium of thought). Phonological images have no semantic role to play, they are only used for imagining sound. So we are left with a non-imagistic medium, mentalese, for the real work of thought.
But note the pregnant role allowed acoustic images: phonological images are for "planning out our mouth movements" -- that is, speech. Inflate this role a bit and you have phonological images as an ingredient in planning what to say, not just how to say it. And this accords with the introspective evidence that we think, at least some of the time, in phonological (or proprioceptive) representations of natural language (for more on this, see Carruthers 1996 and Cole 1997, and for more empirical evidence see e.g. Smith, Wilson and Reisberg 1995).
Later (pp.294-298), Pinker marshals four or so arguments specifically against images as a medium of thought. He begins with the remark, "Imagery is a wonderful faculty, but we must not get carried away with the idea of pictures in the head." (294)
First argument: "Images are fragmentary." One can't image a whole visual scene, just glimpses of parts. And images are always from one vantage point, "distorted by perspective".
Some visual images are perspectival. But two-dimensional icons -- including the letters on this page -- are not. And even when visual images are perspectival, there often are privileged perspectives that show things the way they are, as it were. For some objects of thought, the privileged perspectives are from overhead, God's Eye views (shared with cartographers), and, for other objects, face on. Many artifacts -- televisions, automotive dashboards, building facades, hairstyles -- have privileged perspectives. These perspectives are the most appropriate for viewing the objects - and often, for general thinking purposes, for imagining them as well. If I want to tell you whether frogs have tails, or Clinton has a moustache, or how many windows are on the front of my house, perspectival images are likely the best way to do this. They are a substitute for looking at the rear end of a frog, Clinton's face, the front of my house - and all these lookings would be perspectival.
Pinker goes on to suggest that there is a way that we can image a whole object after all - "To remember an object, we turn it over or walk around it, and that means our memory for it is an album of separate views. An image of the whole object is a slide show or pastiche."(294) But if that solves the limit of the perspective inherent in imagery, why is there any residual problem with visual images as vehicles of thought? In particular, what is wrong with pastiches and slide shows?
More seriously, this objection to images betrays a basic limiting presupposition that permeates discussions of images in thought -- namely that the representational properties of images are limited to those appropriate to pictorial likeness. But this is just to presuppose that the only possible semantics for images would be resemblance. By now it should be clear that this is too provincial in imagining the semantic possibilities. (See the discussion above of the argument based on resemblance semantics).
Second argument: "A second limitation is that images are slaves to the organization of memory. Our knowledge of the world could not possibly fit into one big picture or map. There are too many scales, from mountains to fleas, to fit into one medium with a fixed grain size."(294)
If this is distinct from the limits of images listed heretofore, it appears to be weak. First, again the focus is entirely on visual images. If our knowledge of the world can be stored in mentalese sentences, why can't it be stored in images of natural language sentences? Second, one needn't suppose that thinking in images requires that one have just one big image as one's life-thought. Images might be linked to one another, as they are in a hypertext image map. I may image my Buick parked on the driveway. The Buick part of the image may be causally linked to images of transmissions (exploded diagrams), images of gas stations, petroleum fields, a "How Things Work" cutaway image of an oil well descending to the oil far below, linked to images of dinosaurs and other sources of oil, etc. And the driveway part may be linked to maps of my neighborhood, the city, the world. No single image captures my thought, but images are linked to other images to form an interlaced web of images.
Against this possibility, Pinker continues: "And our visual memory could not very well be a shoebox stuffed with photographs, either. There would be no way to find the one you need without examining each one to recognize what's in it. (Photo and video archives face a similar problem.) Memory images must be labeled and organized within a propositional superstructure, perhaps a bit like hypermedia, where graphics files are linked to attachment points within a large text or database." (294-5)
This argument and the next are probably of central importance; they motivate the requirement of propositional representations distinct from images. But the point is misleading. Hyperlinks are causal links, not a "propositional superstructure". The claim that images must be organized is not equivalent to the claim that they "must be labeled and organized within a propositional superstructure". Photo albums organize, not with a propositional superstructure, but by proximity. Museums label, but they also arrange exhibits in physical space, sometimes with a constrained path so that patrons will encounter exhibits in a predetermined order.
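The contrast between causal links and a propositional superstructure can be illustrated with a small sketch. It is only a toy under stated assumptions: images are opaque byte strings, `links` is a hypothetical table from a region of one image to another image, and `follow` is an invented helper. The identifiers such as "img0" serve as addresses, not descriptions; nothing in the structure is a sentence saying what any image depicts.

```python
from typing import Optional

# Hypothetical store: each id is just an address for uninterpreted pixel data.
images: dict[str, bytes] = {
    "img0": b"\x10\x20\x30",
    "img1": b"\x40\x50",
    "img2": b"\x60",
    "img3": b"\x70\x80",
}

# A region is (x0, y0, x1, y1) in the source image's own coordinates.
Region = tuple[int, int, int, int]

# Links run from a region of one image to another image, as in an image map.
links: dict[tuple[str, Region], str] = {
    ("img0", (0, 0, 50, 40)): "img1",
    ("img0", (0, 40, 50, 80)): "img2",
    ("img1", (10, 10, 20, 20)): "img3",
}


def follow(image_id: str, point: tuple[int, int]) -> Optional[str]:
    """Return the image linked from `point` in `image_id`, or None if unlinked."""
    x, y = point
    for (source, (x0, y0, x1, y1)), target in links.items():
        if source == image_id and x0 <= x < x1 and y0 <= y < y1:
            return target
    return None


# Two hops through the web, starting from a point in the first image.
first_hop = follow("img0", (5, 5))
second_hop = follow(first_hop, (15, 15)) if first_hop else None
```

Retrieval here proceeds from whichever image is currently active along its links, so the shoebox problem of examining every stored image does not arise, and no label has been attached to any image.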
The demand for explicit propositional labeling runs counter to a deeper point Pinker raises earlier - a point going back at least to Lewis Carroll. Carroll's "What Achilles said to the Tortoise" has a moral, which Pinker draws in the course of his discussion (98-99). The moral is that inference connections can't all be by explicit representations of inference rules. We don't need a propositional representation of modus ponens, even though modus ponens is a very important, indeed essential, component of the inference. Yet we manage to infer. So why would we need propositional representations of what to do with images? We just, as Pinker says echoing Nike (p. 99), do it!
Similarly with images. It is entirely possible that images be causally connected with one another, as associationist psychologists supposed. No propositional superstructure is needed for an image of a lion to cause proprioceptive images of running away. But while this may address this specific objection, it leaves the problem that ordinary visual images are incapable of representing abstractions.
Suppose now that the image is itself a tokening of a sentence - an image of a sentence, for example, a phonological image of "All cows decry carnivores". Must we suppose that this image requires an additional label in order to have the content "All cows decry carnivores"? Not for the reasons adduced against, say, a visual image of cows drawing back from a wolf. The question is merely whether a phonological image of "all cows decry carnivores" can serve as a vehicle of thought - can it, e.g., enter into inferences by causing new tokenings of sentences? And in order to do this, do we need an explicit propositional representation, say of "all strings beginning with 'all' are suitable for Barbara inferences"? Achilles all over again. We don't need labels on these strings because they bear their inference potential on their sleeves. (More on this in Cole 1997 - "I don't think so")
"Finally, images cannot serve as our concepts, nor can they serve as meanings for words in the mental dictionary."( 296) The moral derives from British empiricism - as Berkeley noted in his critique of Locke's doctrine of abstract ideas, the imagistic approach to understanding thought could not account for abstract ideas. Pinker says that Berkeley's solution, denial of abstract ideas, is desperate and implausible (and one might remark, somewhat self-defeating for a philosopher).
But Berkeley did not deny that we have abstract thought. He only denied that we have abstract ideas (images). He and Hume had the solution to the problem endemic to previous imagistic approaches, but did not fully appreciate the scope of that solution. The solution is not to suppose that the pictorial content of an imagistic representation determines its inference role. Berkeley says that in thinking about triangles, we cannot form an image of some triangle which is neither equilateral, isosceles nor scalene - or all three at once. But we can use an arbitrary triangle and ignore its particular shape. That is, the functional role of the mental triangle image does not depend upon its pictorial properties.
This may appear to be Pinker's solution as well - except that he supposes that (if an image had an inference role) the inference role must depend upon an associated propositional label. But all Berkeley seems to hold is that there is a purely negative restriction that is needed here - we block the inference from "this imaged triangle is isosceles" to "all triangles are isosceles" or "The Pythagorean theorem, demonstrated with this triangle, applies only to isosceles triangles." In abstracting from the details of my concrete image, I ignore certain features of the image. In making this particular triangle stand for all, I ignore its idiosyncratic properties. This appears to fall far short of having to embed each image in a "propositional superstructure".
There is probably nothing peculiar to our use of images here. Medical students dissect cadavers. Each cadaver is representative of human anatomy in general in very many important respects, but will differ in some respects. In dissecting, med students concentrate on the general anatomical features, and abstract from eye color, fingernail length, and stomach contents.
If images have a role in thought and inference, that role must be determined by something. So far we have seen that there are problems with taking the role to be determined by the pictorial content of the image. The large question we are considering here is whether inference role must be determined by attached propositions. Michael Tye gives a theory of images in which they are propositionally labeled data structures (1991 p 90ff). This may account for introspective reports of images in thought -- but surely the proposition is doing all the heavy lifting as far as thought is concerned! That point will be discussed in conjunction with the next consideration Pinker advances against the adequacy of images as a medium of thought.
Fourth argument - image ambiguity
Pinker says that I can't represent man, the abstraction, by an image of a typical man, Fred MacMurray, because that image represents tall man, adult, human, actor, etc -- but these are all things we have no trouble distinguishing. "Pictures are ambiguous, but thoughts, virtually by definition, cannot be ambiguous." (297) (This parallels a point raised in The Language Instinct against the idea that we might think in natural language - natural language is ambiguous, thought is not. I reply to that argument in Cole 1997.) Pinker continues, "When vision leaves off and thought begins, there's no getting around the need for abstract symbols and propositions that pick out aspects of an object for the mind to manipulate."
Here I think we find clear evidence of the problem created by focussing on visual images -- and a particular subset thereof, images of non-linguistic objects. Phonological images of spoken natural language are abstract and propositional, dealing with aspects of the objects the imaged natural language sentences are about. So are visual images of written language - although it appears the latter do not play an important role in the thought of most humans (visual imagery of symbols may play an essential role in mental arithmetic, though, as in our laborious attempts at mental long division). A visual image of, say, a written sentence may be in a particular color on a background that may have a color, and will be a particular size, etc. Here the Berkeley strategy kicks in - none of this plays a role in inference. This abstraction from inscriptional detail is general -- the voltage with which logical "1"s are stored in computer RAM does not determine inference role (as long as it is within spec, that is, discernible by the machine), and some of the neurophysiological details of mentalese instantiations are content irrelevant and are causally inert in inference.
"[Images] cannot serve as meanings for words in the mental dictionary". Hmmm. One would hope that images are not the meanings of words -- whether in a mental dictionary, or in Webster's. We wish to talk and think about the World. We are indeed a (small) part of that world, and of disproportionate interest to ourselves, but much of our thought and talk is, and needs be, about the world around us. Thus it would be very odd indeed if, say, "dog" or DOG meant a mental image. The question at issue is whether images can be vehicles of thought, not whether they are the objects of thought. (They can of course be objects of thought, but only in arcane psychologizing.)
Pinker continues by arguing that you could not represent a negative concept, like "not a giraffe", using images. Nor could one represent disjunctions, or universal propositions like "all men are mortal".
Again, these points surely hold only against pictures of extra-linguistic objects, or images that function just like pictures of objects (giraffes, e.g.) except that they are in the head. Images of extra-linguistic objects are the typical denizens of photo albums. But many of the images important for thought are not depictions of extra-linguistic objects. If thoughts run through my head, like a tune runs through my head, it will be as phonological images. Not images of noises, but of speech. These images of speech inherit the semantics of speech - and, as we have seen, such important features as compositionality, productivity and systematicity.
IV. Final Argument: The way of ideas led to a coal pit.
Reply: So it did. But only when coupled with an empiricist theory of meaning, which reduced semantics to relations between ideas. Get rid of the empiricism and anti-realism, embrace causal semantics and maybe some connectionist image processing psychology, shift attention from visual images to phonological and proprioceptive imagery of spoken language, and you have a viable alternative to mentalese theories.
References
Block, Ned 1982 (ed.) Imagery
Carroll, Lewis (Charles Dodgson) 1895 What the Tortoise said to Achilles and other riddles
Carruthers, Peter 1996 Language, Thought and Consciousness
Churchland, Paul [space state semantics source, see holism for ref]
Cole, David 1997 "I Don't Think So" http://www.d.umn.edu/~dcole
Fodor 1975 The Language of Thought
Fodor 1987 Psychosemantics
Kosslyn, Stephen 1994 Image and Brain
Pinker, Steven 1994 The Language Instinct
Pinker, Steven 1997 How the Mind Works
Pylyshyn in Block 1982
Reisberg, D. (ed.) 1992 Auditory Imagery: Lawrence Erlbaum
Ryle, Gilbert 1949 The Concept of Mind: Hutchinson
Shepard, R.N. and Metzler, J. 1971 "Mental Rotation of Three-dimensional Objects" Science 171 pp. 701-703.
Smith, D., M. Wilson and D. Reisberg 1995 "The Role of Subvocalization in Auditory Imagery" Neuropsychologia 33, 11 pp. 1433-1454
Tye, Michael 1991 The Imagery Debate: MIT Press