Rodney A. Brooks, {Cambrian Intelligence: The Early History of the New AI}, Cambridge, MA: MIT Press, 1999, xii + 199 pp., $21.56 (paper), ISBN 0-262-52263-2.

{Cambrian Intelligence} is a collection of papers from journals (chapters 1, 2, 4, 5, and 7), conferences (chapters 3 and 8), and a technical report (chapter 6) by Rodney Brooks (and Maja J. Mataric, first author of chapter 3) from 1986 to 1991. Each chapter is preceded by an introductory paragraph. References for all papers are collected at the end of the book. As a cognitive scientist with ties to both computer science and psychology, I found it valuable to have these papers bound together. I had been meaning to spend more time with Brooks' material and this provided a convenient package to achieve that goal.

The book describes Brooks' theoretical approach, several robotic systems, and the {subsumption} architecture used to design the control systems for these robots. As the book presents the ideas of subsumption architecture multiple times across the chapters, I'll summarize the method and then describe the robot systems and their control systems in more detail. Then I summarize his theoretical approach.

Subsumption architecture organizes robot control systems into modules and layers of modules. Each module has a set of inputs and a set of outputs and performs a computation on the inputs, thus producing outputs. Layers implement particular behaviors or activity patterns for the robot and are built by the system designers from the bottom up. Lower numbered levels form the more primitive behaviors, and higher numbered levels utilize these behaviors in constructing more complex or higher-level behaviors. Level 1 (the bottom layer) is built and debugged first. An example level 1 behavior in the six-legged robot {Genghis} causes the robot to {Standup} (p. 32). Level 2, residing on top of level 1, is built and debugged after level 1 is ready and working.
Higher levels can receive data produced by lower-level modules and inject data into the lower-level modules, thus altering normal data flow in the lower level. In Genghis, level 2 generates a {Simple walk} behavior. In order for the {Simple walk} to occur, clearly the robot must {Standup} on its six legs, and thus the higher-level behavior interacts with the lower-level behavior. In Genghis, level 2 receives inputs from level 1 and also provides inputs to level 1 modules. Level 2 examines level 1 data. Level 2 also injects data into the modules of level 1, thereby altering normal data flow in level 1. When a higher-level behavior (layer) examines the data of a lower level, it does not interfere with the functioning of the lower-level behavior (layer) because the computations of the lower level are not affected by this type of inter-layer connection. When a higher-level behavior injects data into the modules of a lower level, it alters the data flow in the lower layer, and through this change of data it affects the computation occurring in the lower level. In the case of Genghis, the level 1 modules do not have any inter-module communication and hence do not have any inputs strictly from level 1 (see p. 33). In the case of level 2 of Genghis, most of the modules have inputs from that layer or from layer 1. Thus, level 3 in Genghis, when it injects its outputs into the level 2 inputs, provides alternate data input to the level 2 modules. The inputs provided from a higher level to a lower level are temporary, and after a specified time delay, the lower-level modules' normal inputs are resumed. The intended principle in subsumption architecture is that the higher-level behavior should generally not cause the lower level to fail. Higher-level behaviors are well and good, but if we spend all of our time planning to reach a goal, and don't avoid that oncoming steamroller, the planning has little benefit!
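The temporary replacement of a lower-level input described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not Brooks' implementation: the class name `SuppressibleInput`, the `hold_time` parameter, and the message values are hypothetical, time is passed in explicitly, and the sketch assumes the most recent normal message is retained during the hold, which the text here does not specify.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class SuppressibleInput:
    """One input line of a lower-level module (hypothetical model)."""
    hold_time: float                        # how long an injected value stays in force
    normal: Optional[Any] = None            # latest value from the normal source
    injected: Optional[Any] = None          # latest value injected from a higher level
    injected_until: float = float("-inf")   # time at which the injection expires

    def write_normal(self, value: Any) -> None:
        # Assumption: normal messages keep being recorded during a hold.
        self.normal = value

    def inject(self, value: Any, now: float) -> None:
        # A higher level replaces the normal input for `hold_time` time units.
        self.injected = value
        self.injected_until = now + self.hold_time

    def read(self, now: float) -> Optional[Any]:
        # While the hold is active the injected value wins; afterwards the
        # normal input is resumed without any action by the lower level.
        if now < self.injected_until:
            return self.injected
        return self.normal


leg_command = SuppressibleInput(hold_time=0.5)
leg_command.write_normal("stand")
leg_command.inject("swing-forward", now=1.0)
during_hold = leg_command.read(now=1.2)   # injected value is in force
after_hold = leg_command.read(now=2.0)    # normal value has resumed
```

Because the injection simply expires, a higher level that stops sending messages leaves the lower level running on its own inputs, which is the property the steamroller remark depends on.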
Since the lower-level modules generally have "normal" inputs (intra-layer or from a still lower level) and since the replacement of a lower-level module's inputs by a higher-level module's output is generally temporary, the lower level will generally maintain its normal functioning. To complete the description of the behaviors designed for Genghis, the remaining layers are: force balancing (level 3), leg lifting (4), whiskers (5), pitch stabilization (6), prowling (7), and steered prowling (level 8).

This set of conventions arbitrating between higher-level and lower-level behaviors is perhaps the most interesting aspect of Brooks' subsumption architecture. Each layer of the control system provides a behavior for the robot, and the mechanisms of data communication from lower to higher levels and data injection from higher levels to lower levels provide a means of {combining the behaviors} in a structured manner so that generally the lower levels (with higher priority programmed goals) maintain their operation. To the extent that an artificial intelligence can combine (and re-combine) its existing behaviors, it can generate new behaviors. Brooks terms this combination of behaviors {emergent} behavior. If these new (emergent) behaviors help the system reach its goals, then they will be adaptive for the system.

In a physical robot implementation, with all computation onboard, modules are scheduled tasks on processing elements, each processing element comprising RAM + CPU + special purpose hardware + connection interface. Any given robot typically has more than one processing element (e.g., Genghis has four). In principle, each module could have a dedicated processing element. Modules are programmed in an augmented finite-state machine style. Inter-module communication is asynchronous in two ways. One module's output sent to another module's input may be lost if the receiver does not read a first message before a second message overwrites it in the input buffer.
Also, modules run asynchronously: they are not generally synchronized with a common clock. In physical robot implementations, connections between modules are physical wires between processing elements and are generally low-bandwidth (e.g., between 1 and 24 bits in width, at about 25 Hz), though some work has been done on vision processing with higher bandwidth communication (e.g., see Chapter 4).

Communication between modules involves data (messages) from sensors, data (messages) to actuators, messages from other modules, replacement of normal message input ({suppression}), and blocking normal message output ({inhibition}). Modules in any layer can be connected to sensors, and this takes the form of input connections to a module. Modules in any layer can have outputs connected to actuators (e.g., a motor). When multiple modules require access to the same actuator, connections are made to arbitration networks connecting to actuators. Intra-layer connections are all message-based and enable one module to pass its processed data along to another module to generate the behavior provided by that layer. The outputs of modules in a layer can be connected to the inputs of higher-layer modules. Inter-layer connections enable modules from higher-level layers to modify the behavior of lower-level layers. A module in a higher-level layer can {suppress} a lower-level module's normal (e.g., intra-layer) input with a connection on a lower-level module's input. When the higher-level module sends a message on a suppression output connection, the lower-level module's normal input is replaced, for a temporary interval, by the input from the higher-level module. A higher-level module can also {inhibit} a lower-level module's output.
When a higher-level module has an inhibition connection from its output to the output of a lower-level module, and the higher-level module sends a message on that output line, it causes output to be suspended from the lower-level module for a temporary interval. These mechanisms of inter-layer communication enable the behaviors produced by the layers to be combined.

Each module is programmed in an augmented finite-state machine style. These state machines have 1-2 timers, 1-2 internal registers, single message input and output buffers for message connections (to actuators, sensors, and other modules), access to special purpose hardware (e.g., for computing vector sums), and a limited set of named states. When a module starts, it is in the NIL state. Modules have a reset input line which causes them to re-enter the NIL state. Modules have variables that can take on values of Lisp data structures (dynamic allocation is not permitted, however; see p. 172). Each named state is a procedure that performs a computation, and a result of the computation is the name of the next state to enter. These state changing procedures also may read from input buffers, send output to an output buffer, change the value of a Lisp variable, and wait for events and time delays. Lisp variables have scope strictly internal to a module. Data is communicated between modules only via inter-module message connections. Computation proceeds by evaluating the procedure of the current named state and determining the next named state.

Brooks' robots utilize sonar range sensors, odometry, 1-bit infrared proximity sensors, robotic hand manipulator sensors (not detailed), a laser light striping system (for 3D depth data), a 1 frame-per-second 256 pixel by 32 pixel high depth camera, force sensors, passive pyroelectric infrared sensors, 4-bit inclinometers, whisker sensors, a light sensor, microphones, a flux gate compass, and 10 frame-per-second low resolution cameras.
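The module programming model described above (a NIL start state, named state procedures that return the name of the next state, single-message buffers, module-private variables, and a reset line) can be made concrete with a short sketch. It is written in Python rather than the Lisp-based notation the book uses, it omits timers and special purpose hardware, and every name in it (`Buffer`, `Module`, the `WAIT` and `EMIT` states, the doubling computation) is a hypothetical choice for illustration.

```python
from typing import Any, Callable, Dict, Optional


class Buffer:
    """Single-message buffer: an unread message is overwritten by the next one."""

    def __init__(self) -> None:
        self._slot: Optional[Any] = None

    def put(self, message: Any) -> None:
        self._slot = message          # any unread earlier message is lost

    def take(self) -> Optional[Any]:
        message, self._slot = self._slot, None
        return message


class Module:
    """Finite-state machine whose state procedures name the next state."""

    def __init__(self, states: Dict[str, Callable[["Module"], str]]) -> None:
        self.states = states
        self.state = "NIL"            # every module starts in NIL
        self.inbox = Buffer()
        self.outbox = Buffer()
        self.register: Optional[Any] = None   # variable private to this module

    def reset(self) -> None:
        self.state = "NIL"            # the reset line re-enters NIL

    def step(self) -> None:
        # Evaluate the current state's procedure; its result is the next state.
        self.state = self.states[self.state](self)


def nil(m: Module) -> str:
    return "WAIT"


def wait(m: Module) -> str:
    message = m.inbox.take()
    if message is None:
        return "WAIT"                 # nothing arrived; stay in WAIT
    m.register = message
    return "EMIT"


def emit(m: Module) -> str:
    m.outbox.put(m.register * 2)      # placeholder computation
    return "WAIT"


relay = Module({"NIL": nil, "WAIT": wait, "EMIT": emit})
relay.step()                          # NIL -> WAIT
relay.inbox.put(1)
relay.inbox.put(2)                    # overwrites the unread 1
relay.step()                          # WAIT -> EMIT, register holds 2
relay.step()                          # EMIT -> WAIT, outbox holds 4
```

The overwritten first message in the usage lines corresponds to the message loss noted earlier for asynchronous inter-module communication.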
Actuators in the robots were motors used for driving wheels, for turning, for controlling an arm with a hand for picking up soda cans, and for making the six legs of Genghis balance and advance. Behaviors achieved by the higher layers of the control systems of these robots include following moving objects, looking for distant objects and heading toward them, chasing objects but staying within a certain boundary distance, wandering and picking up empty soda cans, darkness seeking followed by moving in the direction of the last noise, and learning a map of behaviors associated with landmarks in order to return to an initial landmark and in order to disambiguate among similar landmarks.

The theoretical approach of these chapters has various components. Some of it is based on a thrust away from "traditional" approaches in AI and some of it is based on a thrust towards the innovations presented by Brooks. He suggests that analyzing the problem of constructing an artificial intelligence has been done in the wrong units. Instead of analyzing the problem in terms of independent information processing units such as planning, learning, natural language understanding, and representation, he believes that the correct analysis is in terms of multiple independent and parallel behavior or activity producers, each of which interfaces directly to the world via sensors and motors. In the subsumption architecture, these behavior or activity producers are the layers of a control system design. He suggests that rather than relying on "abstraction ... to factor out all aspects of perception and motor skills" these skills should be emphasized in artificially intelligent systems because "these are the hard problems solved by intelligent systems" and "the shape of solutions to these problems constrains greatly the correct solutions of the small pieces of intelligence which remain" (p. 82).
Examples of reliance on abstraction in more traditional artificially intelligent systems include pre-analyzing search problems into operators, problem state representation, heuristics, and a goal test, and also the inputs used for theorem provers: a knowledge-base of first-order logic formulas and a query to prove. To see that operators, heuristics, etc. are abstractions, think about the problem features that are {not} included in the heuristics, operators, etc. It does seem clear that solutions to the perception problem can constrain the design of an intelligent system. Take for example the mathematical investigation into active vision (systems that can alter the positions of their camera sensors) (e.g., Aloimonos, Weiss, & Bandyopadhyay 1988). Active vision, a general solution strategy in robotic perception, reduces the complexity of the perception problem, and is likely to constrain the rest of the solution space for the artificially intelligent system. Brooks has voiced an important question: What kinds of artificial intelligences can be obtained from architectures that emphasize sensory and motor systems?

Brooks' {physical grounding hypothesis} is "that to build a system that is intelligent it is necessary to have its representations grounded in the physical world" (p. 114). This, of course, is reminiscent of symbol grounding (Harnad 1990). The physical grounding hypothesis contrasts sharply with the physical symbol system hypothesis (e.g., Newell & Simon 1976), which states that a physical symbol system is necessary and sufficient for general intelligent action, where a physical symbol system is a collection of physical patterns (symbols) and expressions containing those symbols. Intrinsic to the physical symbol system hypothesis is the idea that reasoning is vital to intelligent systems. Brooks assumes that processes of reasoning and cognition can emerge out of the sensory and action systems constructed with his subsumption architecture.
As someone interested in nonhuman animal intelligence, I applaud Brooks' work. While Brooks' theoretical statements tend to be dogmatic, I find it relatively easy to relegate his more rhetorical comments to the category of "energy necessary to achieve critical mass." Given that the physical symbol system hypothesis has traditionally dominated the field of artificial intelligence, Brooks has had his intellectual hands full in trying to make these physical grounding hypothesis ideas float. I strongly believe that the physical grounding hypothesis deserves a place at least equal to the physical symbol system hypothesis in our theorizing in artificial intelligence and cognitive science. I believe this because it seems evident that perception, reasoning, and action are all necessary aspects of various kinds of intelligence. Since the physical symbol system hypothesis for the most part excludes perception and action, the physical grounding hypothesis fills an important gap in our theorizing.

I would, however, like to tackle an issue among Brooks' statements that I believe is wrongheaded and has some important ramifications. The issue is related to definitions of intelligence and how we view human intelligence as compared to the intelligence of other animal species. Brooks uses the terms "human level intelligence" (p. 80) and "simpler level intelligence" (p. 80) in his descriptions. The intuition, of course, is that an animal such as a nematode has a simpler level of intelligence than a human. This seems reasonable, given the vast differences in number of neurons. However, this kind of reasoning is generally not supported by modern biology. Nonhuman animals do not have simpler intelligences than humans; they have {different} intelligences than humans. For example, chimpanzees are {not} animals with a subset of human intelligence. While we undoubtedly share some cognitive skills, chimpanzees also have some cognitive skills that we do not have and vice versa.
The evolutionary common ancestor we share with chimpanzees was not a chimpanzee! Rather, past this common ancestral species, we continued to evolve and they continued to evolve (see also Hodos & Campbell 1969). Interestingly, a critique of Brooks' work has followed the same outmoded view of evolution: "possession of concepts in a full-blooded form appears only some way up the evolutionary ladder" (p. 164, Kirsh 1991). The key aspect behind the primacy of human intelligence insofar as artificial intelligence is concerned is that {we} are human and we generally value our intelligence above the intelligences of other species. The ramification of this view is that, to the degree that humans have cognitive specializations that are specific to humans (i.e., humans have differences in their cognitive skills from other animals), an approach of building "up" to human intelligence may not succeed. Rather, in our artificially intelligent systems, we will likely have to specially construct artificial cognitive skills that are specific to humans.

I also wish that Brooks had contributed more in terms of the issue of implicit-explicit representation. For example, this could have been done in a concluding chapter complementing the existing preface. Part (and perhaps all) of what Brooks intends by implicit versus explicit representation (e.g., see chapter 5) is sensor-motor representation versus first-order logic representation. This distinction has the benefit of being concrete, but certainly does not cover all the theoretical ground. It seems clear that some kinds of explicit representation can be useful in mobile robotic systems. Chapter 3 in the book describes the robot Toto that constructs a map of an environment using Brooks' subsumption architecture. Other research (e.g., Thrun 1997) has made more explicit use of explicit map representations or models in mobile robots, with good results.
From my perspective, a main point of Brooks' research is that we do not yet know how to efficiently compute heavy-weight abstractions from first principles and so Brooks proceeds with light-weight abstractions. By heavy-weight abstraction I mean, for example, the operators, problem state representation, heuristics, and goal tests used as pre-analyzed input into search procedures, and the inputs used for theorem provers: a knowledge-base of first-order logic formulas and a query to prove. I could stick with the term "explicit representation" here, but I think the point is not that the representations Brooks is avoiding in his behavior-based robots are "explicit"; it is more that they are, as of now, too computationally expensive for first principles (bottom-up) computation. Heavy-weight vs. light-weight emphasizes a degree of processing required (see also Kirsh 1990). Surely the abstractions in maps of environments are heavier-weight than information recently arriving from sensors. Brooks has laid out a set of abstractions that can be efficiently computed and that are useful in behavior-based robotics.

An unanswered question in Brooks' physical grounding hypothesis is whether or not symbol usage can {emerge} in the subsumption architecture. Brooks might relegate this question to one that is "improper [in a deep sense] to ask" (p. 127) but I think not. Given that Brooks' subsumption architecture achieves certain implicit representations (light-weight abstractions), a fundamental empirical question remains: What kinds of computational architectures can autonomously produce, through emergence or development, symbol usage given that in their initial states, they have only implicit representations? Such architectures may or may not be subsumption architectures. Keeping in mind Brooks' from-first-principles emphasis, we can keep computing somewhat more heavy-weight abstractions on top of what have been developed so far.
Hopefully we'll end up, eventually, with abstractions that are equivalent to human symbolic abstraction.

Notably lacking from the collection is any coverage of more recent work from MIT, e.g., on {Cog} (e.g., Brooks & Stein 1994) or {Kismet} (e.g., Breazeal & Scassellati 2000). This is unfortunate because one would like to see how these methods have adapted to use with humanoid robots and more particularly, to human behavior modeling. Of course, an integrated treatment of the work would have been nice, as would detailed module diagrams of the control systems of additional fully autonomous robots. Only the control system from a single physically implemented autonomous robot (Genghis) is presented in a detailed module diagram with enough information to determine the modules in each layer. I also found myself wanting an index. More mention would also have been good of their {new} behavior language, which somehow enables groupings of modules into manageable abstract units.

CHRISTOPHER G. PRINCE
{Department of Computer Science},
{University of Minnesota Duluth},
{Duluth, MN 55812 U.S.A.}
{E-mail: chris@cprince.com}

Aloimonos, J., Weiss, I., & Bandyopadhyay, A. (1988). 'Active vision', {International Journal of Computer Vision} 1, pp. 333-356.
Breazeal, C. & Scassellati, B. (2000). 'Infant-like social interactions between a robot and a human caregiver', {Adaptive Behavior} 8, pp. 49-74.
Brooks, R. A. & Stein, L. A. (1994). 'Building brains for bodies', {Autonomous Robots} 1, pp. 7-25.
Harnad, S. (1990). 'The symbol grounding problem', {Physica D} 42, pp. 335-346.
Hodos, W. & Campbell, C. B. G. (1969). '{Scala naturae:} Why there is no theory in comparative psychology', {Psychological Review} 76, pp. 337-350.
Kirsh, D. (1990). 'When is information explicitly represented?', in P. P. Hanson (Ed.), {Information, Language, and Cognition.} Vancouver, BC: UBC Press.
Kirsh, D. (1991). 'Today the earwig, tomorrow man?', {Artificial Intelligence} 47, pp. 161-184.
Newell, A. & Simon, H. A. (1976). 'Computer science as empirical inquiry: Symbols and search', {Communications of the ACM} 19, pp. 113-126.
Thrun, S. (1997). 'To know or not to know', {AI Magazine} 18, pp. 47-54.