BUILDING AN ARTIFICIAL BRAIN
Using an FPGA Based CAM-Brain Machine



Hugo de Garis (1), Michael Korkin (2), Felix Gers (3),
Eiji Nawa (1), Michael Hough (4)


(1) Brain Builder Group, Evolutionary Systems Department, ATR.
2-2 Hikari-dai, Seika-cho, Soraku-gun, Kyoto-fu, 619-02, Japan.
Tel: (+81) (0)774 95 1079, Fax: (+81) (0)774 95 1008
degaris@hip.atr.co.jp, http://www.hip.atr.co.jp/~degaris

(2) Genobyte Inc.,
1503 Spruce Street, Suite 3, Boulder, Colorado 80302, USA
tel. (+ 1) 303 545 6790, fax. (+ 1) 303 415 1109
korkin@genobyte.com, http://www.genobyte.com

(3) Istituto Dalle Molle di Studi sull'Intelligenza Artificiale (IDSIA),
Corso Elvezia 36, CH-6900, Lugano, Switzerland.
tel. +41 (091) 911 9838, fax. +41 (091) 911 9839
felix@idsia.ch, http://www.idsia.ch/~felix/

(4) Computer Science Department
Stanford University, Stanford, 94305, CA, USA.
mhough@cs.stanford.edu, http://www.stanford.edu/~mhough



ABSTRACT


This paper provides an overview of ATR's Artificial Brain (CAM-Brain) Project, which aims to build an artificial brain containing some billion artificial neurons by the year 2001, and to use this brain to control the behaviors of a robot kitten "Robokoneko".


1. Introduction

This paper provides an overview of ATR's Artificial Brain (CAM-Brain) Project, which at the time of writing (September 1998), has been running for nearly 6 years. The basic aim of the project as first stated in 1993 was to build an artificial brain containing a billion artificial neurons by the year 2001. It looks as though the actual figure will be more like 40 million, but the billion figure is still reachable. One of the highlights of this project will be the delivery (probably January of 1999) of the so-called "CAM-Brain Machine" (CBM) [3], an FPGA (Field Programmable Gate Array) based, programmable logic device which grows and evolves cellular automata based neural network modules at electronic speeds, and will update the states of an artificial brain consisting of tens of thousands of such modules in real time by updating their 3D cellular automata (CA) cells at the rate of 150 Billion a second. This speed is such that it will be possible to evolve these modules in about a second each (i.e. a complete run of a genetic algorithm, with tens of thousands of circuit growths and circuit fitness (performance) evaluations).

The ATR Brain Builder team expects that the CBM will revolutionize the field of neural networks (by creating neural systems with tens of millions of artificial neurons, rather than just the conventional tens to hundreds, as has been the custom up to now), and will create a new research field called "Brain Building". The CBM will make practical the creation of artificial brains, which are defined to be assemblages of tens of thousands (and higher magnitudes) of evolved neural net modules into humanly defined artificial brain architectures. An artificial brain will consist of a large RAM memory space, into which individual CA modules are downloaded once they have been evolved. The CA cells in this RAM will be updated by the CBM fast enough for real time control of a robot kitten "Robokoneko" (Japanese for "robot, child, cat"). Since the neural net model used to fit into state-of-the-art evolvable electronics has to be simple, the signaling states of the neural net were chosen to be 1 bit binary. We label this model "CoDi-1Bit"[2]. Some examples of evolved modules, with different functionalities, are included in this paper, to give a flavor of what "CoDi-1Bit" modules can do.

The remainder of this paper is structured as follows. Section 2 gives a brief explanation of the CoDi-1Bit neural net model that is implemented by the CAM-Brain Machine (CBM). Section 3 provides a detailed summary of the CBM design and functionality. Section 4 describes the representation (SIIC = Spike Interval Information Coding) that our team has chosen to interpret the 1 bit signals which are input to and output from the CoDi modules. Section 5 shows how analog time-dependent signals can be converted into spike trains (bit strings of 0s and 1s) to be input into CoDi modules (HSA = Hough Spiker Algorithm). The SIIC (spiketrain to analog signal conversion) and the HSA (analog signal to spiketrain conversion) allow users (EEs = evolutionary engineers) to think entirely in analog terms when specifying input signals and target (desired) output signals, which is much easier than thinking in terms of spike intervals (the number of 0s between the 1s). Section 6 introduces early work on the behavioral repertoire and mechanical design of the kitten robot "Robokoneko" that a future artificial brain will control. Section 7 presents a sample of what evolved CoDi modules can do. Section 8 discusses ideas for interesting future modules to be evolved. Section 9 concludes.


2. The CoDi-1Bit Neural Network Model

The CBM implements the so called "CoDi" (i.e. Collect and Distribute) [2] cellular automata based neural network model. It is a simplified form of an earlier model developed at ATR HIP (Kyoto, Japan) in the summer of 1996, with two goals in mind. One was to make neural network functioning much simpler and more compact compared to the original ATR HIP model, so as to achieve considerably faster evolution runs on the CAM-8 (Cellular Automata Machine), a dedicated hardware tool developed at Massachusetts Institute of Technology in 1989.

In order to evolve one neural module, a population of 30-100 modules is run through a genetic algorithm for 200-600 generations, resulting in up to 60,000 different module evaluations. Each module evaluation consists of - firstly, growing a new set of axonic and dendritic trees, guided by the module's chromosome (which provide the growth instructions for the trees). These trees interconnect several hundred neurons in the 3D cellular automata space of 13,824 cells (24*24*24). Evaluation is continued by sending spiketrains to the module through its efferent axons (external connections) to evaluate its performance (fitness) by looking at the outgoing spiketrains. This typically requires up to 1000 update cycles for all the cells in the module.

On the MIT CAM-8 machine, it takes up to 69 minutes to go through 829 billion cell updates needed to evolve a single neural module, as described above. A simple "insect- like" artificial brain has hundreds of thousands of neurons arranged into ten thousand modules. It would take 500 days (running 24 hours a day) to finish the computations.

Another limitation was apparent in the full brain simulation mode, involving thousands of modules interconnected together. For a 10,000-module brain, the CAM-8 is capable of updating every module at the rate of one update cycle 1.4 times a second. However, for real time control of a robotic device, an update rate of 50-100 cycles per module, 10-20 times a second is needed. So, the second goal was to have a model which would be portable into electronic hardware to eventually design a machine capable of accelerating both brain evolution and brain simulation by a factor of 500 compared to CAM-8.

The CoDi model operates as a 3D cellular automata (CA). Each cell is a cube which has six neighbor cells, one for each of its faces. By loading a different phenotype code into a cell, it can be reconfigured to function as a neuron, an axon, or a dendrite. Neurons are configurable on a coarser grid, namely one per block of 2*2*3 CA cells. Cells are interconnected with bidirectional 1-bit buses and assembled into 3D modules of 13,824 cells (24*24*24).

Modules are further interconnected with 92 1-bit connections to function together as an artificial brain. Each module can receive signals from up to 92 other modules and send its output signals to up to 32,768 modules. These intermodular connections are virtual and implemented as a cross-reference list in a module interconnection memory (see below).

In a neuron cell, five (of its six) connections are dendritic inputs, and one is an axonic output. A 4-bit accumulator sums incoming signals and fires an output signal when a threshold is exceeded. Each of the inputs can perform an inhibitory or an excitatory function (depending on the neuron's chromosome) and either adds to or subtracts from the accumulator. The neuron cell's output can be oriented in 6 different ways in the 3D space. A dendrite cell also has five inputs and one output, to collect signals from other cells. The incoming signals are passed to the output with an 5-bit XOR function. An axon cell is the opposite of a dendrite. It has 1 input and 5 outputs, and distributes signals to its neighbors. The "Collect and Distribute" mechanism of this neural model is reflected in its name "CoDi". Blank cells perform no function in an evolved neural network. They are used to grow new sets of dendritic and axonic trees during the evolution mode.

Before the growth begins, the module space consists of blank cells. Each cell is seeded with a 6-bit chromosome. The chromosome will guide the local direction of the dendritic and axonic tree growth. Six bits serve as a mask to encode different growth instructions, such as grow straight, turn left, split into three branches, block growth, T- split up and down etc. Before the growth phase starts, some cells are seeded as neurons at random locations. As the growth starts, each neuron continuously sends growth signals to the surrounding blank cells, alternating between "grow dendrite" (sent in the direction of future dendritic inputs) and "grow axon" (sent towards the future axonic output). A blank cell which receives a growth signal becomes a dendrite cell, or an axon cell, and further propagates the growth signal, being continuously sent by the root neuron, to other blank cells. The direction of the propagation is guided by the 6-bit growth instruction, described above. This mechanism grows a complex 3D system of branching dendritic and axonic trees, with each tree having one neuron cell associated with it. The trees can conduct signals between the neurons to perform complex spatio-temporal functions. The end-product of the growth phase is a phenotype bitstring which encodes the type and spatial orientation of each cell.

Thus there are two main phases - neural net growth and neural net signaling. In the CoDi-1Bit model, the signal states contain only 1 bit, so an interpretation problem arises. With an 8 bit signal for example (as was the case in the old CAM-Brain Project model) one simply looks at the signal state to see the signal value. With 1 bit signaling, one needs to choose an interpretation of the signals, e.g. frequency based (count the number of spikes (1s) in a given time), or interpret the spacing between the spikes as containing information etc. These interpretation issues will be taken up in section 4.


3. The CAM-Brain Machine (CBM)


3.1 CBM Overview

The CAM-Brain Machine (CAM stands for Cellular Automata Machine) is a research tool for the simulation of artificial brains. An original set of ideas for the CAM-Brain project was developed by Dr. Hugo de Garis at the Evolutionary Systems Department of ATR HIP (Kyoto, Japan), and is currently being implemented as a dedicated research tool by Genobyte, Inc. (Boulder, Colorado). Genobyte is licensed by ATR International and Japan's Key Technologies Center to manufacture and sell CBMs to third parties.

An artificial brain, supported by the CBM, consists of up to 32,768 neural modules, each module populated with up to 1,152 neurons, a total of 37.7 million neurons. Within each neural module, neurons are densely interconnected with branching dendritic and axonic trees in a three-dimensional space, forming an arbitrarily complex interconnection topology. A neural module can receive afferent axons from up to 180 other modules of the brain, with each axon being capable of multiple branching in three dimensions, forming hundreds of connections with dendritic branches inside the module. Each module sends efferent axon branches to up to 32,768 other modules.

A critical part of the CBM approach is that neural modules are not "manually designed" or "engineered" to perform a specific brain function, but rather evolved directly in hardware, using genetic algorithms.

Genetic algorithms operate on a population of chromosomes, which represent neural networks of different topologies and functionalities. Better performers for a particular function are selected and further reproduced using chromosome recombination and mutation. After hundreds of generations, this approach produces very complex neural networks with a desired functionality. The evolutionary approach can create a complex functionality without any a priori knowledge about how to achieve it, as long as the desired input/output function is known.


3.2 CBM Architecture

The CBM consists of the following six major blocks:

1) Cellular Automata Module
2) Genotype/Phenotype Memory
3) Fitness Evaluation Unit
4) Genetic Algorithm Unit
5) Module Interconnection Memory
6) External Interface

Each of these blocks is discussed in detail below.


3.2.1 Cellular Automata Module

The cellular automata module is the hardware core of the CBM. It is intended to accelerate the speed of brain evolution through a highly parallel execution of cellular state updates. The CA module consists of an array of identical hardware logic circuits or cells arranged as a 3D structure of 24*24*24 cells (a total of 13,824 cells). Cells forming the top layer of the module are recurrently connected with the cells in the bottom layer. A similar recurrent connection is made between the cells on the north and south, east and west vertical surfaces. Thus a fully recurrent toroidal cube is formed. This feature allows a higher axonic and dendritic growth capacity by effectively doubling each of the three dimensions of the cellular space.

The CBM hardware core is time-shared between multiple modules forming a brain during brain simulation. Only one module is instantiated at a time. The FPGA firmware design is a dual-buffered structure, which allows simultaneous configuration of the next module while the current module is being run (i.e. signals are propagated through the dendrites and axons between neurons). Thus, the FPGA core is run continuously without any idle time between modules for reconfiguration.

The surfaces of the cube have external connections to provide signal input from other modules. Each surface has a matrix of 60 signals, which is repeated on the opposite surface due to wrap around connections. Thus, a total of 180 different connections is available. Three connections, one on each of the surfaces, are used as output points.

The CA module is implemented with new Xilinx FPGA devices XC6264. These devices are fully and partially reconfigurable, feature a new co-processor architecture with data and address bus access in addition to user inputs and outputs, and allow the reading and writing of any of the internal flip-flops through the data bus. An XC6264 FPGA contains 16384 logic function cells, each cell featuring a flip-flop and Boolean logic capacity, capable of toggling at a 220 MHz rate. Logic cells are interconnected with neighbors at several hierarchical levels, providing identical propagation delay for any length of connection. This feature is very well suited for a 3D CA space configuration. Additionally, clock routing is optimized for equal propagation time, and power distribution is implemented in a redundant manner.

To implement the CA module, a 3D block of identical logic cells is configured inside each XC6264 device, with CoDi specified 1-bit signal buses interconnecting the cells. Given the FPGA internal routing capabilities and the logic capacity needed to implement each cell, the optimal arrangement for a XC6264 is 4*6*8 (192 cells). This elementary block of cells requires 208 external connections to form a larger 3D block by interconnecting with six neighbor FPGAs on the south, north, east, west, top, and bottom sides in a virtual 3D space. A total of 72 FPGAs, arranged as a 6*4*3 array are used to implement a 24*24*24 cellular cube.

The CBM implements interconnections between 72 FPGAs, each placed on a small individual printed circuit board, in the form of one large backplane board, carrying all 72 FPGA daughter boards.

The CBM clock rate for cellular update is selected between 8.25 MHz, 9.42 MHz, and 11 MHz. At this rate all 13,824 cells are updated simultaneously, which results in the update rate of 114 to 152 billion cells/s. This rate exceeds the CAM-8 update rate by a factor of 570 to 751 times.


3.2.2 Genotype and Phenotype Memory

Each of the 72 FPGA daughter boards includes 16 Mbytes of EDO DRAM to be used for storing the genotypes and phenotypes of the neural modules, a total of 1,180 Mbytes. There are two modes of CBM operation, namely evolution mode and run mode. The evolution mode involves the growth phase and signaling phase. During the growth phase, memory is used to store the chromosome bitstrings of the evolving population of modules (module genotypes). For a module of 13,824 cells there are over 91 Kbits of genotype memory needed. For each module the genotype memory also stores information concerning the locations and orientations of the neurons inside the module, and their synaptic masks.

During the run mode, memory is used as a phenotype memory for the evolved modules. The phenotype data describes the grown axonic and dendritic trees and their respective neurons for each module. The phenotype data is loaded into the CA module to configure it according to the evolved function. The genotype/phenotype memory is used to store and rapidly reconfigure (reload) the FPGA hardware CA module. Reconfiguration can be performed in parallel with running the module, due to a dual pipelined phenotype/genotype register provided in each cell. This guarantees the continuous running of the FPGA array at full speed with no interruptions for reloading in either evolution or run modes. The phenotype/genotype memory can support up to 32,758 interconnected neural modules at a time. An additional memory will be based in the main memory of the host computer (Pentium-Pro 300 MHz) connected to the CBM through a PCI bus, capable of transferring data at 132 Mbytes/s.


3.2.3 Fitness Evaluation Unit

Signaling in the CBM is accomplished with 1-bit spiketrains, a sequence of ones separated by intervals of zeros, similar to those of biological neural networks. Information, representing external stimuli, as well as internal waveforms, is encoded in spiketrains using a so-called "Spike Interval Information Coding (SIIC)". This method of coding is implemented by nature in animal neural networks, and is very efficient in terms of information capacity per spike. Conversion from spiketrains into "analog" waveforms representing external stimuli, or internal signaling, is accomplished by convolving the spiketrain with a special multi-tap linear filter.

When a module is being evolved, it must be evaluated in terms of it's fitness for a targeted task. During the signaling phase, each module receives up to 180 different spiketrains, and produces up to three different output spiketrains, which is compared with a target array of spiketrains in order to guide the evolutionary process. This comparison gives a measure of performance, or fitness, of the module.

Fitness evaluation is supported by a hardware unit which consists of an input spiketrain buffer, a target spiketrain buffer, and a fitness evaluator. During each clock cycle an input vector is read from its stack and fed into the module's inputs. At the same time, a target vector is read from its buffer to be compared with the current module outputs by the evaluator. The fitness evaluator performs a convolution of the spiketrains with the convolution filter, and computes the sum of the waveform's absolute deviations for the duration of the signaling phase. At the end of the signaling phase, a final measure of the module's fitness is instantly available.


3.2.4 Genetic Algorithm Unit

To evolve a module, a population of modules is evaluated by computing every module's fitness measure, as described above. A subset of the best modules are then selected for further reproduction. In each generation of modules, the best are mated and mutated to produce a set of offspring modules to become the next generation. Mating and mutation is performed by the CBM hardware core at high speed, configured for the genetic phase. During this phase, each cell's firmware implements crossover and mutation masks, two parent registers and an offspring register. Thus, each offspring chromosome is generated in nanoseconds, directly in hardware. The selection algorithm is performed by the host computer in software, using access to the CBM via a PCI interface.


3.2.5 Module Interconnection Memory

In order to support the run mode of operation, which requires a large number of evolved modules to function as one artificial brain, a module interconnection memory is provided. Each module can receive inputs from up to 180 other modules. A list of these source modules referenced to each module is stored in a CBM cross-reference memory (3 Mbytes) by the host computer. This list is compiled by CBM software using a module interconnection netlist in EDIF format. This netlist reflects the module interconnections as designed by the user, using off-the-shelf schematic capture tools.

The length of module interconnections is 96 cells (clock cycles). For each of the 32,768 modules, a Signal Memory stores up to three 96-bit long output spiketrains.

During the run mode, at the time each module of a brain is configured in the CA hardware core (by loading its phenotype), a signal input buffer is also loaded with up to 180 spiketrains according to the netlist in the module interconnection memory. The spiketrains are the signals saved from the previous instantiation and signaling of the 180 sourcing modules. At the same time, the three output spiketrains of the currently instantiated module are saved back to the Signal Memory. This repetitive cycling through all the modules which form the brain, results in a repetitive saving and retrieving of the spiketrains to/from the Signal Memory. It provides the signaling between modules according to the brain interconnection structure reflected in the schematics, designed by the user.

In a maximum brain with 32,768 modules, the CBM update rate is such that each cell propagates approximately 288 bit-long spiketrains per second. A 288 bit-long spiketrain can carry on the order of 12 bytes of signal information, using the SIIC coding method. Each neuron receives up to 5 spiketrains, so there are up to 188 million spiketrains being processed by neurons in the brain. Thus the maximum information processing rate by all neurons in the brain is of the order of 2.25 Gbytes/s.

Additional spiketrain processing in multiple dendritic branches can be estimated by assuming 50% of the total cellular space to be occupied by dendrite cells, each cell on average having 2.5 branches out of 5 possible. Informational throughput of dendrite cells is then of the order of 6.8 Gbyte/s.


3.2.6 External Interface

The CBM architecture can receive and send spiketrains not only from/to the Signal Memory, but also from/to the external CBM interface. Any module can receive up to 188 incoming spiketrains and send up to 3 spiketrains to an external device, such as a robot, a speech processing system, etc. In a brain with 16,384 modules, the information rate, as measured at the external interface is up to 4.5 Kbytes/s per each module, or up to 74 Mbyte/s overall. In a smaller brain with less number of modules, the external information rate is higher, for example, a brain with 4,000 modules provides quadruple the external information rate for each module (18 Kbyte/s).


3.3 Summary of CBM Technical Specifications

Cellular Automata Update Rate (max.) ------------------------ 152 billion cells/s.
Cellular Automata Update Rate (min.) ------------------------ 114 billion cells/s.
Number of Supported Cellular Automata Cells (max.) --------------- 453 million
Number of Supported Neurons (max., per module) ----------------- 1,152.
Number of Supported Neurons (max., per brain) ------------------ 37,748,736.
Number of Supported Neural Modules ------------------------ 32,768.
Information Flow Rate, Neuronal Level (max.) ------------------- 2.25 Gbytes/s.
Information Flow Rate, Dendrite Level (estimated average) ----------- 6.8 Gbytes/s.
Information Flow Rate, Intermodular Level (max.) ----------------- 74 Mbytes/s.
Number of FPGAs ------------------------------------- 72.
Number of FPGA Reconfigurable Function Units ------------------ 1,179,648.
Phenotype/Genotype Memory ------------------------------ 1.18 Gbytes.
Chromosome Length ------------------------------------ 91,008 bits.
Power Consumption ------------------------------------ 1 KWatt (5 V, 200 A).


4. The Spike Interval Information Coding Representation, "SIIC"

Choosing a Representation for the CoDi-1Bit Signaling

The constraints imposed by state-of-the-art programmable (evolvable) FPGAs in 1998 are such that the CA based model (the CoDi model) had to be very simple in order to be implementable within those constraints. Consequently, the signaling states in the model were made to contain only 1 bit of information (as happens in nature's "binary" spike trains). The problem then arose as to interpretation. How were we to assign meaning to the binary pulse streams (i.e. the clocked sequences of 0s and 1s which are a neural net module's inputs and outputs? We tried various ideas such as a frequency based interpretation, i.e. count the number of pulses (i.e. 1s) in a given time window (of N clock cycles). But this was thought to be too slow. In an artificial brain with tens of thousands of modules which may be vertically nested to a depth of 20 or more (i.e. the outputs of a module in layer "n" get fed into a module in layer "n+1", where "n" may be as large as 20 or 30) then the cumulative delays may end up in a total response time of the robot kitten being too slow (e.g. if you wave your finger in front of its eye, it might react many seconds later). We wanted a representation that would deliver an integer or real valued number at each clock tick, i.e. the ultimate in speed. The first such representation we looked at we called "unary" i.e. if N neurons on an output surface are firing at a given clock tick, then the firing pattern represented the integer N, independently of where the outputs were coming from. We found this representation to be too stochastic, too jerky. Ultimately we chose a representation which convolves the binary pulse string with the convolution function shown in Fig.2. We call this representation "SIIC" (Spike Interval Information Coding) which was inspired by [5]. This representation delivers a real valued output at each clock tick, thus converting a binary pulse string into an analog time dependent signal. Our team has already published many papers on the results of this convolution representation work [4]. Figs. 3 and 4 show some results of CoDi modules which were evolved to output oscillatory signals (using the convolutionary interpretation). Fig. 5 show the evolved output for a random target analog signal. We thought the results were good enough to settle on this representation. The CBM will implement this representation in the FPGAs when measuring fitness values at electronic speeds.


The SIIC Convolution Algorithm

The convolution algorithm we use takes the output spiketrain (a bit string of 0s and 1s), and runs the pulses (i.e. the 1s) by the convolution function shown in the simplified example below. The output at any given time t is defined as the sum of those samples of the convolution filter that have a 1 in the corresponding spiketrain positions. The example below should clarify what is meant by this.

Simplified Example :

Convolve the spiketrain 1101001 (where the left most bit is the earliest, the right most bit, the latest) using the convolution filter values { 1 4 9 5 -2 }. The spiketrain in this diagram moves from left to right across the convolution filter. Alternatively, one can view the convolution filter (window) moving across the spiketrain. The number to the right of the colon shows the value of the convolution sum at each time "t".
time-shifted spiketrain :  1 0 0 1 0 1 1 ---> (moves left to right)

      conv. filter :   1  4  9  5 -2

  1  0  0  1  0  1  1  
                       0  0  0  0  0  : 0    t = -1

     1  0  0  1  0  1  1  
                       1  0  0  0  0  : 1    t = 0

        1  0  0  1  0  1  1  
                       1  4  0  0  0  : 5    t = 1

           1  0  0  1  0  1  1  
                       0  4  9  0  0  : 13   t = 2

              1  0  0  1  0  1  1  
                       1  0  9  5  0  : 15   t = 3

                 1  0  0  1  0  1  1
                       0  4  0  5 -2  : 7    t = 4

                    1  0  0  1  0  1  
                       0  0  9  0 -2  : 7    t = 5

                       1  0  0  1  0
                       1  0  0  5  0  : 6    t = 6

                          1  0  0  1
                       0  4  0  0 -2  : 2    t = 7

                             1  0  0
                       0  0  9  0  0  : 9    t = 8

                                1  0
                       0  0  0  5  0  : 5    t = 9

                                   1
                       0  0  0  0 -2  : -2   t = 10


Hence, the time-dependent output of the convolution filter takes the values (0,1,5,13,15,7,7,6,2,9,5,-2). This is a time varying analog signal, which is the desired result.


5. The "Hough Spiker Algorithm" (HSA) for Deconvolution

Section 4 above explained the use of the SIIC (Spike Interval Information Coding) Representation which provides an efficient transformation of a spike train (i.e. string of bits) into a (clocked) time varying "analog" signal. We need this interpretation in order to interpret the spike train output from the CoDi modules to evaluate their fitness values (e.g. by comparing the actual converted analog output waveforms with user specified target waveforms). However, we also need the inverse process, i.e. an algorithm which takes as input, a clocked (digitised, i.e. binary numbered) time varying "analog" signal, and outputs a spike train. This conversion is needed as an interface between the motors/sensors of the robot bodies (e.g. a kitten robot) that the artificial brain controls, and the brain's CoDi modules. However, it is also very useful to users, i.e. EEs (evolutionary engineers) to be able to think entirely in terms of analog signals (at both the inputs and outputs) rather than in abstract, visually unintelligible spiketrains. This will make their task of evolving many CoDi modules much easier. We therefore present next an algorithm which is the opposite of the SIIC, namely one which takes as input, a time varying analog signal, and outputs a spike train, which if later is convoluted with the SIIC convolution filter, should result in the original analog signal.

A brief description of the algorithm used to generate a spiketrain from a time varying analog signal is now presented. It is called the "Hough Spiker Algorithm" (HSA) and can be viewed as the inverse of the convolution algorithm described above in section 4.

To give an intuitive feel for this deconvolution algorithm, consider a spiketrain consisting of a single pulse (i.e. all 0s with one 1). When this pulse passes through the convolution function window, it adds each value of the convolution function to the output in turn.

A single pulse: ( 1 0 0 0 0 0 --> t=+inf ) will be convoluted with the convolution function expressed as a function of time. At t=0 its value will be the first value of the convolution filter, at t=1 its value will be the second value of the convolution filter, etc. Just as a particular spiketrain is a series of spikes with time delays between them, so too the convolved spiketrain will be the sum of the convolution filters, with (possibly) time delays between them. At each clock tick when there is a spike, add the convolution filter to the output. If there is no spike, just shift the time offset and repeat.

The same example.
1101001
x 1 4 9 5 -2

    t -> 0  1  2  3  4  5  6  7  8  9 10                      
out:
 1       1  4  9  5 -2
 1          1  4  9  5 -2
 0             0  0  0  0  0
 1                1  4  9  5 -2
 0                   0  0  0  0  0        
 0                      0  0  0  0  0
 1                         1  4  9  5 -2
         -------------------------------
         1  5 13 15  7  7  6  2  9  5 -2


  In the HSA deconvolution algorithm, we take advantage of this summation, and in effect do the reverse, i.e. a kind of progressive subtraction of the convolution function.  If at a given clock tick, the values of the convolution function are less than the analog values at the corresponding positions, then subtract the convolution function values from the analog values. The justification for this is that for the analog values to be greater than the convolution values, implies that to generate the analog signal values at that clock tick, the CoDi module must have fired at that moment, and this firing contributed the set of convolution values to the analog output. Once one has determined that at that clock tick, there should be a spike, one subtracts the convolution function's values, so that a similar process can be undertaken at the next clock tick. For example, to deconvolve the convolved output (using the same vlaue of the convolution function as in the simple example of the previous section :

         1  5 13 15  7  7  6  2  9  5 -2
compare: 1  4  9  5 -2                     conv.vals

  It is assumed that spiking will irreversibly raise the value of the convolved output. If the convolution filter value at a given clock tick is less than that of the target waveform, spiking will bring the two values closer together. If the waveform value is still too low after a spike has occurred, a near future spike will bring the two closer together.

Below is the C code for the algorithm. "DUR" is the duration of the spiketrain (the number of bits), "error" is the sum of the differences of the convolution value and waveform value at each clock tick, "N_CONV" is the number of elements (integers) in the convolution function, "wave" stores the value of the waveform at each clock tick, "bits" stores the spiking history (0 or 1) for each clocktick, thresh is the threshold value to decide when error is small enough for a spike, "conv_fn" contains the values of the convolution function.


void deconv(int wave[ ], char bits[ ], int thresh)
{
int i,j,ni;
int error;

for(i=0;i< DUR;i++) {
error=0;
for(j=0;j< N_CONV;j++) {
if(i+j>DUR) break;
if(wave[i+j]< conv_fn[j]) error+=conv_fn[j]-wave[i+j];
}

/* if error is okay, SPIKE! */
if(error< thresh) {
bits[i]=1;
for(j=0;j< N_CONV;j++) {
if(i+j>DUR) break;
wave[i+j]-=conv_fn[j];
}
}
else bits[i]=0;
}
}
Figs 5 and 6 show two examples of the HSA in action. The original input analog signals "Oi" are shown as stippled lines. The spiketrain resulting from each analog input is sent into the SIIC convolutor (shown in Fig. 1). The resulting analog output should be very close to the original "Oi" and is shown as a solid line. The third line near the bottom is the absolute difference (error) between the two analog signals. The HSA seems to work well when the values of the waveforms are large and do not take values close to zero, and do not change too quickly relative to the time width of the convolution filter window. An example of applying too stringent an analog wave form is shown in Fig 7. It may be possible to simply add a constant value to incoming analog signals before spiking them and to ensure that the analog signal does not change too rapidly.

Note however, that the HSA deconvolution algorithm was only discovered very recently, so the neural net module evolution that is discussed in section 7, does not use it. The inputs to these modules as specified by the EE (evolutionary engineer) were in binary, not analog.


6. "Robokoneko", the Kitten Robot

An artificial brain with nothing to control is rather useless, so we chose a controllable object that we thought would attract a lot of media attention, i.e. a cute life-size robot kitten that we call "Robokoneko" (which is Japanese for "robot-child-cat"). We did this partly for political and strategic reasons. Brain building is still very much in the "proof of concept" phase, so we want to show the world something that is controlled by an artificial brain, that would not require a PhD to understand what it is doing. If the kitten robot can perform lots of interesting actions, this will be obvious to anyone simply by observation. The more media attention the kitten robot gets, the more likely our brain building work will be funded beyond 2001 (the end of our current research project).

Fig. 4 shows the mechanical design our team has chosen for the kitten robot. Its total length is about 25 cms, hence roughly life size. Its torso has two components, joined with 2 degrees of freedom (DoF) articulation. The back legs have 1 DoF at the ankle and the knee, and 2 DoF at the hip. All 4 feet are spring loaded between the heel and toe pad. The front legs have 1 DoF at the knee, and 2 DoF at the hip. With one mechanical motor per DoF, that makes 14 motors for the legs. 2 motors are required for the connection between the back and front torso, 3 for the neck, 1 to open and close the mouth, 2 for the tail, 1 for camera zooming, giving a total of 23 motors.

In order to evolve modules which can control the motions of the robot kitten, we thought it would be a good idea to feed back the state of each motor (i.e. a spiketrain generated from the pulse width modulation PWM output value of the motor) into the controlling module. Since each module can have up to 188 inputs, feeding in these 23 motor state values will be no problem. We are thinking we may install acceleromotors and/or gyroscopes which may add another 6 or more inputs to each motion control module. It can thus be seen that the mechanical design of the kitten robot has implications on the design of the CBM modules. There need to be sufficient numbers of inputs for example.

The motion control modules will not be evolved directly using the mechanical robot kitten. This would be hopelessly slow. Mechanical fitness measurement is impractical for our purposes. Instead we will soon be simulating the kitten's motions using an elaborate commercial simulation software package called "Working Model - 3D". This software will allow input from an evolving module to control the simulated motors of the simulated kitten. This software simulation approach negates to some extent the philosophy of the CAM-Brain Machine and the CAM-Brain Project, i.e. the need for hardware evolution speeds. This compromise was felt to be a necessary evil. In practice, the proportion of modules concerned with motion control will be very small compared to the total. Potentially, we have 32K modules to play with. Probably most of them will be concerned with pattern recognition, vision, audition, etc. and decision making. Designing the kitten robot artificial brain remains the greatest research challenge of the CAM-Brain Project and will occupy us through 1998 and 1999, and possibly beyond.


7. A Sampler of CoDi-1Bit Evolved Neural Net Modules

Since the whole point of using the CBM is to attain a high evolution speed, it is useful if the representation chosen to interpret the 1 bit signals which enter and leave the CoDi modules can be unique, otherwise several representations would need to be implemented in the electronics. (For the CBM to be efficient, i.e. to evolve CoDi modules in about 1 second, fitness measurements need to be performed at electronic speeds, which implies that the representation chosen for the signals be implemented directly in the hardware). We chose the SIIC to be our unique representation. However, as mentioned at the bottom of section 5, most of the evolutionary experiments presented here were already undertaken before the SIIC representation was chosen. Since the results of these earlier experiments are interesting in their own right, we report on them here. They show to what extent that CoDi modules are evolvable and the power of their functionality. The evolution of SIIC-representation-based and HSA-based modules will be the subject of work in the very near future, given that both algorithms are now ready. So is the CBM multi-module simulation code, so progress should be rather rapid in the coming months prior to the delivery of the CBM itself. Once the CBM is delivered, multi-module systems should be built as fast as we can dream them up. The bottleneck in building large scale multi-module systems will become human creativity lag, not module evolution lag (as was the case with software evolution speeds in the "pre-CBM era".) We now provide a sample of evolved CoDi neural net modules, their specified functionalities, and their actual performances, to give a feel for what they can do.


XOR Module

If a CoDi module could not be evolved to perform something as simple as an exclusive OR, then the whole CAM-Brain approach would be cast in doubt, so one of the first things we tried was to evolve an XOR module. The module size was a cube of 24*24*24 3D CA cells (the standard CBM module size, as implemented in the hardware). Two binary signals, A and B (on the Z=0 face of the cube at (x,y,z) coordinates of (1,1,0) and (7,1,0) (axon cells)) were input over 64 clock ticks. There were 4 test cases (using a multi-test fitness measurement).

Case 1:
The A input was a steady stream of 1s for 64 clocks, and
the B input was a steady stream of 1s for 64 clocks.

Case 2:
The A input was a steady stream of 1s for 64 clocks, and
the B input was a steady stream of 0s for 64 clocks.

Case 3:
The A input was a steady stream of 0s for 64 clocks, and
the B input was a steady stream of 1s for 64 clocks.

Case 4:
The A input was a steady stream of 0s for 64 clocks, and
the B input was a steady stream of 0s for 64 clocks.

At the output point (4,4,0), the desired or target output values were a stream of 0s, 1s, 1s and 0s respectively (to implement the XOR function). The total fitness value was the sum of the partial fitness values of the 4 test cases, i.e. the total number of correct output values, giving a theoretical maximum fitness value of 4*64 = 256. However, since there are inevitable delays as the signals progress through the CA cells of the module, this perfect score is not possible.

The elite module, (population of 15, mutation rate of 0.005 per bit per generation, one point crossover rate of 0.6 per chromosome per generation), gave after about 20 generations, the following outputs for the 4 cases, with partial fitness scores of 63, 57, 56, 64 respectively. Hence total fitness was 240.

Case 1:
0000000100000000000000000000000000000000000000000000000000000000

Case 2:
0000000111111111111111111111111111111111111111111111111111111111

Case 3:
0000000011111111111111111111111111111111111111111111111111111111

Case 4:
0000000000000000000000000000000000000000000000000000000000000000

So the XOR module evolved fairly quickly and easily. Note that the XOR case is an example of "multi-test fitness measurement". For each test, a partial fitness score is obtained, which is later summed with other partial fitess scores, resulting in the overall fitness score of the CoDi module. Between each partial test, the signal states of the module are cleared, and a new test run is performed using the same circuit, but with different input. The same story occurs in the CBM hardware.


Timer Module

One of the first things we tested after the CoDi-1Bit model was invented, was to see if a CoDi module (consisting of 4K 3D CA cells, with about 150 artificial neurons in the 4K space, could evolve a "timer", i.e. where constantly firing binary inputs generate at a single cell output (placed elsewhere in the CA space) a string of 0s during the first 30 clock ticks, then a string of 1s during the next 20 clock ticks, and finally a string of 0s in the last 20 clocks, as shown below. This was quite a demanding evolutionary task, which was a useful test vehicle during our early evolutionary trials.

000000000000000000000000000000 11111111111111111111 00000000000000000000

The fitness definition was pretty simple. (Note, that the evolution of this module did not use the SIIC or HSA approaches. Inputs and outputs were specified directly in binary signals). If a 0 appeared in the first (0) block, score one point. If a 1 appeared in the second (1) block, score three points. If a 0 appeared in the third block (0), score two points. Hence a perfect score would be 30*1 + 20*3 + 20*2 = 130. Population size was 24, with no crossover.

The Codi-1Bit model software simulation evolved this with a fitness of 100% in about 150 generations. A few days later, by strongly increasing the neuron density in the CA space to about 90% of the maximum value, we got the same result in about 50 generations!


Multiple Timer Module

Since a 100% fitness score does not test the limits of evolvability of a module, a more demanding output function was tried. The target output (similar to the above pattern) and the actual evolved output (placed immediately under the target pattern for comparison) were as follows -

000000000000000000000000000000 11111111111111111111 -> (Target)
000000000000000000000000000000 00011111111111111111 -> (Evolved)

-> 000000000000000000000000 1111111111111111 00000000000000000000 (Target ctd.)
-> 100000000000000000000000 0111111111111111 10000000000000000000 (Evolved ctd.)

The fitness definition was similar to the above. If a 0 appeared in the first (0) block, score 12 points. If a 1 appeared in the second (1) block, score 7 points. If a 0 appeared in the third block (0), score 3 points. If a 1 appeared in the fourth block (1), score 2 points. If a 0 appeared in the fifth block (0), score 1 point. Hence a perfect score would be 30*12 + 20*7 + 24*3 + 16*2 + 20*1 = 624. These weightings were chosen so as to encourage the earlier outputs to be correct before the later outputs. Population size was 30. No crossover. This result converged after about 100 generations with a fitness value of 0.957

It is interesting to note that these good results are evolving in about 100 generations or so, and yet the chromosome length is very large. The standard CBM chromosome length is of the order of 90K bits. One might think that such a long chromosome would be very slow in evolving, but this was not the case. One possible explanation for this is that there may be so many possible solutions, that (any reasonable) one is quickly found.


Switchable Dual Function Module

Our thoughts then turned to the idea of trying to evolve a module whose behaviors could be placed under switchable control, i.e. a module with dual functionality, which could be switched from one behavior to the other, depending on whether a "control" input was activated or not.

More specifically, two fixed position input points "IN" and "SWITCH" were placed on an input surface (z = 0), at positions (8,8,0) and (16,16,0) for a rectanguloid of 24*24*18 3D CA cells, with a fixed output point at position (11,12,9). If the output point was not an axon, fitness was defined to be zero.

Two experiments were run on the same module (the same CoDi-1Bit circuit, with signal flushing between experiments). In both experiments, the IN input fired at every clock tick. In the first experiment, the SWITCH input was off for every clock tick. In the second experiment the SWITCH input fired for every clock tick. The module was evolved to give a very active output (lots of 1s) if the SWITCH was off, but a low output (few 1s) if the SWITCH was on. That is, the SWITCH acted as an inhibitor.

The bitstrings below show the outputs for the two cases, firstly with SWITCH off, then on. Over 90 clock ticks, the first output had 42 more 1s than the second output.

SWITCH off
000000000000000000000000000000000001100011101111111111111111
111111111111111111111111111111


SWITCH on
000000000000000000000000000000000000100000100000100000100000
100000100000100000100000100000

The number of 1s in the two outputs were labeled as Sum1 and Sum 2 respectively. The fitness definition finally settled upon, was :-

(if Sum1 > Sum2)
fitness = 10000*(Sum1 - Sum2) + 0.001*(Sum1 + Sum2)

(if Sum1 < Sum2)
fitness = 100*(Sum2 - Sum1) + 0.001*(Sum1 + Sum2)

The term 0.001*(Sum1 + Sum2) was used to encourage circuits to give nonzero output at the output point. The terms 100*(Sum2 - Sum1) and 10000*(Sum1 - Sum2) encouraged differences in the two outputs, with a strong preference for the first case to give more 1s in the output.

This result was very encouraging because it shows that controllable (switchable) multi-function modules are evolvable with the highly simplified (suitable for implementation in evolvable electronics) CoDi-1Bit model. Such modules will be very useful when the time comes to evolve them for placement in artificial brain architectures.


Pattern Detector Module

With a slight modification of the code used to evolve the above module, a pattern detector module was evolved which was capable of distinguishing between two square wave inputs, of 111000111000... and 11111000001111100000... In this case, no switch input was used. Two experiments were run. In the first, the input was the 6 clocktick cycle square wave input, applied at the fixed input point (8,8,0). In the second experiment, the circuit was regrown with the same chromosome and the 10 clocktick cycle square wave input was applied to the same fixed input point. The fitness definition was the same as above. Results are shown below. Over 90 clockticks, the first output had 48 more 1s than the second output.

Square wave input 111000111000...

Output
000000000000000000000000100110111011111111111111111111111111
111111111111111111111111111111


Square wave input 11111000001111100000...

Output
000000000000000000000000000010001000100010001000100010001000
100010000000000010001000100010

Since the CoDi modules seem capable of evolving such detectors, it may be possible to evolve modules which are capable of detecting a specific phoneme analog input (e.g. the spike train (bit string) which when convoluted with a particular convolution function gives the time dependent analog signal). In a manner similar to the above, one could input the signal in the first experiment, and a random signal in the second, in a multi-test experiment, and evolve the phoneme detector. Maybe one could evolve a set of detectors, one for each phoneme. By using the SIIC and HSA digital/analog conversions, this kind of thing may become quite practical.


Hubel-Wiesel Line Motion Detector Module

The results of the following experiment were significant for the CAM-Brain Project as a whole, we felt. It involved the evolution of a Hubel Wiesel type line motion detector. Hubel and Wiesel won a Nobel prize for discovering that certain neural cells in the visual region of the cat's brain fired strongly when lines of light at particular orientations and speeds were shone onto a screen that the cats were observing. These cells (neurons) were detecting the motion of lines at a particular orientation. The evolution of this "Hubel-Wiesel" module used the same fitness definition and a similar methodology as in the above case. In the first experiment, a square 12*12 neuron input grid was used. At the first clock tick, the top horizontal 12 neurons were made to fire - at the second clock tick, the second horizontal row of 12 neurons was made to fire, etc, for 12 clock ticks, then the cycle was repeated. This input firing pattern simulated the motion of a line of light moving horizontally down the visual field on the retina of a cat. In the second experiment, 12 randomly positioned input neurons were fired at each clock tick. These 12 positions were randomly generated for each clock tick. This second input firing pattern simulated input noise, to be contrasted with the line motion input. Output results are shown below.


Line Motion Input Case

Output
000000000000000000000000100110100110111111111111111111111111
111111111111111111111111111111


Random Input Case

Output
000000000000000000000000000010000000000000000100010101010101
010101010101010101010101010101

There were 35 more 1 bit outputs in the first case than the second. Since the inputs to the second case are positioned randomly, the same neural net module will generate a different fitness value depending on the input. Nevertheless the evolution still improved over time, developing a fairly robust net giving fitness values corresponding to over 30 1bit differences (between the two experiments) in most cases (e.g. the top 5 fitness chromosomes were saved for each generation and not crossed over or mutated. The fitness values (1bit difference count) of these top 5 were 31, 34, 35, 30, 29 after several hundred generations). Thus it was possible to evolve a Hubel Wiesel Line Motion Detector - quite an achievement. Of course, we have no idea how the circuit does what it does. This is the great strength of "evolutionary engineering".Evolved circuits can achieve performance levels beyond what human engineers can achieve with traditional top-down design techniques, i.e. attain superior engineering performance levels, but the price is that one loses scientific understanding, due to the overwhelming structural and dynamical complexity of these CoDi circuits. Thus "evolutionary engineering" can provide superior engineering, but inferior science. It is a trade-off. In practice, once EEs can generate tens of thousands, even millions of modules, only a few die-hard analysts will want to know how an individual module functions. For the most part, noone will care how a particular module amongst millions actually does its thing.


8. Ideas for Interesting Future CoDi Modules to be Evolved

The "Dagwood Approach" for a Multi-Test Module

The CBM hardware automatically performs a fitness measurement on the assumption that the 1Bit signals which leave the evolving module into the fitness measurement circuit are interpreted with the SIIC approach, i.e. the hardware actually implements the SIIC convolution algorithm. We have implemented the CBM having a single very general fitness measurement methodology, to simplify the electronics. Hence evolutionary engineers using the CBM will need to specify the functions of the modules they want to evolve using the SIIC methodology. However, there is a problem with this unified approach. How to give the same circuit several tests, i.e. several sets of different inputs in a single run. For example, imagine one aims to evolve a module which detects a time dependent input pattern P. One inputs the pattern P for Tp clocks. (Note that a CBM module has maximum 188 binary inputs, maximum 3 binary outputs). One wants the module to respond strongly when the pattern in detected, and weakly if any other pattern is presented. Hence the same circuit needs to be tested for several pattern inputs, i.e. P and others. The pattern P is called the positive case, the others are called the negative cases. (It is also possible that there may be several positive cases (Pi)). One does not want a module which responds well to any old pattern. It must discriminate.

How to test all these cases (positive and negative) in a single run? By concatenating them, i.e. sandwiching them ("Dagwooding" them) over time. (Dagwood was a popular US cartoon character in the 1960s who was famous for making multilayer sandwiches). For example, imagine there are 2 positive examples and 4 negative examples to be input to the same circuit. Hence there will be 6 time periods in which the patterns are presented sequentially at the input in one long run. Between each input signal presentation, the signal states in the circuit are cleared out, ready for the next signal input. This resetting of the signal states is part of the CBM fitness measuring approach that we call "multi-test" fitness measurement. The 6 input pattern periods can be represented as "P1, P2, N1, N2, N3, N4". The periods last Pi and Ni clock ticks each. So that the total number of clock ticks for the positive periods is more or less equal to the total of the negative periods, the durations of the Pi can be lengthened. This should increase the evolvability of the positive responses. Otherwise the evolution may favor the negative cases too heavily. The target output patterns one wants for these 6 periods can be represented as "high, high, low, low, low, low".

Clearing of the signal states between individual inputs in multi-test runs is needed because it is almost certain that reverberating loops will be set up in the circuit such that once an initial input is presented, self sustaining loops are generated which ensure non zero output, even when the input signal is switched off. Such self sustaining loops may in fact be very useful, since they can be looked upon as a form of memory, and hence may be used to make CoDi modules capable of learning, i.e. adapting to their experience. The next section will elaborate on this idea.

The CBM evaluates each partial fitness (one for each test in the multi-test case) and then sums the partial fitnesses to get the total fitness for the circuit (the module). The XOR case described in the previous section is a simple example of a multi-test case.


Modules Which Learn

Up until recently, we have always thought that the CAM-Brain Project would produce neural circuits that would be INcapable of learning, i.e. they would not modify themselves based on their run time experience. The rationale was that it would be complicated enough dealing with tens of thousands of non learning modules all interacting with each other, let alone having tens of thousands of learnable modules. Also, we saw no way of having CoDi modules which could learn. Lately however, we have begun to think that learnable CoDi modules might be evolvable. The essence of learning in a system is that some event in the past leaves some trace or memory in the system. In a CoDi module, that could take the form of reverberating internal signaling after an initiating input arrives. In some modules, once the input stops, the resulting 1Bit signals could die away, i.e. be transient. Alternatively, the reverberating signals could persist and hence constitute a form of memory. Thus CoDi modules may be evolvable which generate reverberating signals.

Once one begins to think along these lines, it may be possible to create modules capable of Pavlovian conditioning and similar phenomena, e.g. one could evolve a module which has two possible inputs M(eat) and B(ell). One evolves the circuit and gives it 3 tests (in a multi-test case), namely M alone (followed by a signal reset), then B alone (followed by a signal reset), then M and B together (with NO signal reset) followed by B alone. The respective output for each of the 3 cases should be high, low, high. If one employs the above zero input Z spacing technique, then the dagwooded input stream would be "[M], [B], [(M&B) then B]", and the desired outputs would be "high, low, high". Note that there would be no signal reset between the M&B input and the B input. This allows for a reverberating internal signaling to be set up by the M&B which allows the B input to give a high output. If this module can be made to function, it shows that classical conditioning is possible with CoDi modules. (Of course certain negative cases would need to be Dagwooded into the multi-test case as well). Thus artificial brains cpable of learning could be designed using learnable, conditionable modules.


From Multi Module Systems to Artificial Brains

Once our group and others have gained a lot of experience in evolving single modules, the next obvious step is to start to design multi-module systems, since the ultimate goal of the CAM-Brain Project is to put many many modules together (up to 32,000 of them in the current design of the CBM) to make artificial brains. Obviously, no CAM-Brain team will try to build a 32000 module brain (with maximum 40 million artificial neurons) all at once. Instead, as a first step, small multi-module systems will be built, with tens of modules. Once experience is gained in how to do this successfully, larger systems will be undertaken, e.g. with 100s of modules, then 1000s, and later 10,000s. What kinds of multi-module systems with about a dozen modules might be interesting to build? Answering such a question will depend on the creativity of individual (human) "evolutionary engineers". Here are some ideas for "10 to 1" systems (i.e. ten to power 1 = tens of modules) - handwritten "capital letter" recognizers (where each module is evolved to respond to a particular input stimulus e.g. 3 converging lines on the middle left hand side of the input region (which would make the letters A, E, F, H, P potential candidates). Similar feature extractor modules could be placed in appropriate input positions. Boolean modules (AND module, OR module) could combine the outputs of the feature extractor modules, to distinguish the letters. A "winner-take-all" macromodule (i.e. a module of modules considered as a functional unit) would be needed to detect the letter module with the largest output signal. How to design/evolve the macromodule? The usual way is to use a form of lateral inhibition. Using CoDi modules to build such a device might not be trivial.

Another "10 to 1" suggestion is a simple artificial nervous system with a small number of sensor modules, and a limited number of motion generating modules, which can be switched on or off by a control signal (i.e. the motion is active while the control signal is active, and becomes inactive when the control signal is inactive). The sensor modules (e.g. the environment is very bright, or dark, or noisy, or quiet, etc) can send their output signals to some decision modules (probably Boolean), whose outputs become the activating signals to the motion generating modules.

Over time, artificial nervous systems can grow in complexity, until they can be called artificial brains. The robot kitten ("Robokoneko" = Japanese for "robot, child, cat") that our team is currently designing will be controlled by an artificial brain with up to 32000 modules. Since this kitten robot contains a CCD TV camera, microphones for ears, touch sensors, 22 motors for the legs and body, etc, it should offer plenty of scope for brain building. Obviously, we will probably begin with "10 to 2" systems to control it, and work our way up to "10 to 4" systems. This is a huge amount of work, which will need to be distributed over many CAM-Brain teams across the planet. With modern (almost cost free) internet telephone technology, coordinating such a large management effort is less expensive. Our team already uses IPhone to talk with our international collaborators on a daily basis.


9. Conclusions

This paper has provided an overview of ATR's Artificial Brain ("CAM-Brain") Project, highlighting the capacities of the new CAM-Brain Machine (CBM) which should be delivered to ATR in January of 1999. This machine will update 150 Billion CA cells a second and evolve a CA based neural net module in about 1 second. This speed should make practical the assemblage of tens of thousands of evolved neural net modules into humanly defined artificial brain archtitectures, and hopefully create a new research field that we call simply "Brain Building". This paper has discussed the neural net model ("CoDi-1Bit") which is implemented by the CBM. Also presented were discussions on how to convert back and forth between analog time dependent signals and spiketrains (bit strings of 0s and 1s), thus enabling users to think entirely in terms of analog input and target output signals. A sample of evolved neural network modules using the CoDi-1Bit model was given. Once the CBM is delivered and sufficient experience with it enables the construction of large neural systems, with tens of thousands of modules, an artificial brain will be designed and built to control the behavior of a robot kitten called "Robokoneko" (Japanese for "robot, child, cat"). The challenges which remain in ATR's CAM-Brain Project are to fully test the limits of the evolvability of the CoDi-1Bit modules (using the CBM), so as to gain experience in what can be readily evolved and what cannot, and then to assemble large numbers of them to make Robokoneko's brain. The biggest challenge will probably be creating the brain's architecture.

The CBM should be fast enough for many multi-module tests to be undertaken. Multi-module systems can be evolved, assembled into the RAM, and then tested as a functional unit. Once a system has been built successfully it can be used as a component in a larger system, ad infinitum. The challenges of the CAM-Brain Project are not only conceptual in nature, but managerial as well. A back of the envelope calculation says that if an "evolutionary engineer (EE)" (i.e. someone who evolves a neural net module using a CBM) takes half an hour of human thinking time to dream up the fitness definition of a module, to specify the module's input signal(s), its target output signal, its input and output links with other modules, etc, then 4 EEs would be needed to complete the design of a 32000 module artificial brain within 2 years. Thus one needs to speak of brain builder teams. If one wants to be a lot more ambitious and build a million module artificial brain in 2 years, then 120 EEs are needed. Such a large team would need managers to control them. One can imagine higher level "brain architects" (BAs) handing out module specifications to lower level EEs who actually evolve them on their CBMs and report back to the BAs with the results. The BAs and EEs need not be located in one place. Modern internet telephone technologies make globally distributed "virtual teams" practical. Our ATR CAM-Brain team already uses the "IPhone" on a daily basis to collaborate with international colleagues.

If artificial brains can be made to work reasonably successfully, e.g. by making interesting robot pets, or simple household cleaner robots, etc, then a new artificial brain based computer industry will be created, which in twenty years may be worth a trillion euros a year world wide.


References

1 Hugo de Garis. "An artificial brain : ATR's cam-brain project aims to build/evolve an artificial brain with a million neural net modules inside a trillion cell cellular automata machine". New Generation Computing Journal, 12(2), pp 215-221, July 1994.

2 Felix Gers, Hugo de Garis, and Michael Korkin. "Codi-1 Bit: A simplified cellular automata based neuron model". In Proceedings of AE97, Artificial Evolution Conference, October 1997.

3 Michael Korkin, Hugo de Garis, Felix Gers, and Hitoshi Hemmi. "CBM (CAM-Brain Machine): A hardware tool which evolves a neural net module in a fraction of a second and runs a million neuron artificial brain in real time". In John R. Koza, Kalyanmoy Deb, Marco Dorigo, David B. Fogel, Max Garzon, Hitoshi Iba, and Rick L. Riolo, editors, Genetic Programming 1997: Proceedings of the Second Annual Conference, July 1997.

4 Michael Korkin, Norberto Eiji Nawa, and Hugo de Garis. "A 'spike interval information coding' representation for atr's CAM-brain machine (cbm)". In Proceedings of the Second International Conference on Evolvable Systems: From Biology to Hardware (ICES'98). Springer-Verlag, September 1998.

5 Fred Rieke, David Warland, Rob de Ruyter van Steveninck, and William Bialek. "Spikes: exploring the neural code". MIT Press/Bradford Books, Cambridge, MA, 1997.