Darwin-L Message Log 45: 1–10 — May 1997

Academic Discussion on the History and Theory of the Historical Sciences

Darwin-L was an international discussion group on the history and theory of the historical sciences, active from 1993–1997. Darwin-L was established to promote the reintegration of a range of fields all of which are concerned with reconstructing the past from evidence in the present, and to encourage communication among scholars, scientists, and researchers in these fields. The group had more than 600 members from 35 countries, and produced a consistently high level of discussion over its several years of operation. Darwin-L was not restricted to evolutionary biology nor to the work of Charles Darwin, but instead addressed the entire range of historical sciences from an explicitly comparative perspective, including evolutionary biology, historical linguistics, textual transmission and stemmatics, historical geology, systematics and phylogeny, archeology, paleontology, cosmology, historical geography, historical anthropology, and related “palaetiological” fields.

This log contains public messages posted to the Darwin-L discussion group during May 1997. It has been lightly edited for format: message numbers have been added for ease of reference, message headers have been trimmed, some irregular lines have been reformatted, and error messages and personal messages accidentally posted to the group as a whole have been deleted. No genuine editorial changes have been made to the content of any of the posts. This log is provided for personal reference and research purposes only, and none of the material contained herein should be published or quoted without the permission of the original poster.

The master copy of this log is maintained in the Darwin-L Archives (rjohara.net/darwin) by Dr. Robert J. O’Hara. The Darwin-L Archives also contain additional information about the Darwin-L discussion group, the complete Today in the Historical Sciences calendar for every month of the year, a collection of recommended readings on the historical sciences, and an account of William Whewell’s concept of “palaetiology.”

DARWIN-L MESSAGE LOG 45: 1-10 -- MAY 1997

A Network Discussion Group on the
History and Theory of the Historical Sciences

Darwin-L@raven.cc.ukans.edu is an international network discussion group on
the history and theory of the historical sciences.  Darwin-L was established
in September 1993 to promote the reintegration of a range of fields all of
which are concerned with reconstructing the past from evidence in the present,
and to encourage communication among academic professionals in these fields.
Darwin-L is not restricted to evolutionary biology nor to the work of Charles
Darwin but instead addresses the entire range of historical sciences from an
interdisciplinary perspective, including evolutionary biology, historical
linguistics, textual transmission and stemmatics, historical geology,
systematics and phylogeny, archeology, paleontology, cosmology, historical
anthropology, historical geography, and related "palaetiological" fields.

This log contains public messages posted to Darwin-L during May 1997.
It has been lightly edited for format: message numbers have been added for ease
of reference, message headers have been trimmed, some irregular lines have been
reformatted, and some administrative messages and personal messages posted to
the group as a whole have been deleted.  No genuine editorial changes have been
made to the content of any of the posts.  This log is provided for personal
reference and research purposes only, and none of the material contained herein
should be published or quoted without the permission of the original poster.
The master copy of this log is maintained on the Darwin-L Web Server at
http://rjohara.uncg.edu.  For instructions on how to retrieve copies of this
and other log files, and for additional information about Darwin-L and the
historical sciences, connect to the Darwin-L Web Server or send the e-mail
message INFO DARWIN-L to listserv@raven.cc.ukans.edu.

Darwin-L is administered by Robert J. O'Hara (darwin@iris.uncg.edu), Center for
Critical Inquiry in the Liberal Arts and Department of Biology, University of
North Carolina at Greensboro, Greensboro, North Carolina 27412 U.S.A., and it
is supported by the Center for Critical Inquiry, University of North Carolina
at Greensboro, and the Department of History and the Academic Computing Center,
University of Kansas.


<45:1>From DARWIN@iris.uncg.edu Thu May  1 01:02:33 1997

Date: Thu, 01 Apr 1997 01:02:28 -0500 (EST)
From: DARWIN@iris.uncg.edu
Subject: List owner's monthly greeting
To: darwin-l@ukanaix.cc.ukans.edu
Organization: University of NC at Greensboro

Greetings to all Darwin-L subscribers.  On the first of every month I send
out a short note on the status of our group, along with a reminder of basic
commands.  For additional information about the group please visit the
Darwin-L Web Server <http://rjohara.uncg.edu>.

Darwin-L is an international discussion group for professionals in the
historical sciences.  The group is not devoted to any particular discipline,
such as evolutionary biology, but rather seeks to promote interdisciplinary
comparisons across the entire range of fields concerned with historical
reconstruction, including evolution, historical linguistics, archeology,
geology, cosmology, historical geography, textual transmission, and history
proper.  Darwin-L is not an amateur chat forum, nor a forum for discussion
of creationism and evolution.  Darwin-L currently has about 700 members from
more than 35 countries.

Because Darwin-L does have a large membership and is sometimes a high-volume
discussion group it is important for all participants to try to keep their
postings as substantive as possible so that we can maintain a favorable
"signal-to-noise" ratio.  Because Darwin-L is not a chat-oriented group,
personal messages should be sent by private e-mail rather than to the group
as a whole.  The list owner does lightly moderate the group in order to
filter out error messages, commercial advertising, and occasional off-topic
postings. Subscribers who feel burdened from time to time by the volume of
their Darwin-L mail may wish to take advantage of the "digest" option
described below.

Because different mail systems work differently, not all subscribers see
the e-mail address of the original sender of each message in the message
header (some people only see "Darwin-L" as the source).  It is therefore
very important to include your name and e-mail address at the end of every
message you post so that everyone can identify you and reply privately if
appropriate.  Remember also that in most cases when you type "reply" in
response to a message from Darwin-L your reply is sent to the group as a
whole, rather than to the original sender.

The following are the most frequently used listserv commands that Darwin-L
members may wish to know.  All of these commands should be sent as regular
e-mail messages to the listserv address (listserv@raven.cc.ukans.edu),
not to the address of the group as a whole (Darwin-L@raven.cc.ukans.edu).
In each case leave the subject line of the message blank and include no
extraneous text, as the command will be read and processed by the listserv
program rather than by a person.  To join the group send the message:


     For example: SUBSCRIBE DARWIN-L John Smith

To cancel your subscription send the message:


If you feel burdened by the volume of mail you receive from Darwin-L you
may instruct the listserv program to deliver mail to you in digest format
(one message per day consisting of the whole day's posts bundled together).
To receive your mail in digest format send the message:


To change your subscription from digest format back to one-at-a-time
delivery send the message:


To temporarily suspend mail delivery (when you go on vacation, for example)
send the message:


To resume regular delivery send either the DIGEST or ACK messages above.

For a comprehensive introduction to Darwin-L with notes on our scope and
on network etiquette, and a summary of all available commands, send the


To post a public message to the group as a whole simply send it as regular
e-mail to the group's address (Darwin-L@raven.cc.ukans.edu).

I thank you all for your continuing interest in Darwin-L and in the
interdisciplinary study of the historical sciences.

Bob O'Hara, Darwin-L list owner

Dr. Robert J. O'Hara (darwin@iris.uncg.edu)  |  Darwin-L Server
Cornelia Strong College, 100 Foust Building  |   http://rjohara.uncg.edu
University of North Carolina at Greensboro   |  Strong College Server
Greensboro, North Carolina 27412 U.S.A.      |   http://strong.uncg.edu


<45:2>From YTL@vms.huji.ac.il Sat May 10 16:23:39 1997

Date: Sun,  11 May 97 0:22 +0300
From: <YTL@vms.huji.ac.il>
To: darwin-l@raven.cc.ukans.edu
Subject: Ontogeny recapitulates...

I would be very interested to learn about the current standing of Ernst
Haeckel's dictum, "Ontogeny recapitulates phylogeny."
Clearly the rule as Haeckel intended it is no longer accepted (or so I think).
However, scientific rules, laws, and theories do take on a life of their own,
and may often still be considered valid or useful in ways that are quite
different from their original formulation.
More specifically, then, I would like to know (1)how the rule is judged
today--true, false, useful, irrelevant, etc., and (2)the basis for this
judgment--a consensus of scientists, particular experimental or observational
evidence, or something else.
I would also be very interested to learn of any applications of this rule to
fields outside of biology.

Many thanks,

Tzvi Langermann


<45:3>From jsl@rockvax.rockefeller.edu Sun May 18 20:05:40 1997

To: darwin-l@raven.cc.ukans.edu
Subject: SocioBiological interpretation of evolution of esthetic sense?
     Non-cognitive components of higher mental functions
Date: Sun, 18 May 1997 21:05:53 EDT
From: Joshua Lederberg <jsl@rockvax.rockefeller.edu>

Whence the passion for artistic creation and appreciation for its
products?  In realms of visual arts and of music?  Or to generalize
a step further, for the Promethean impulse: mastery over Nature?

I see a role for sexual selection: birds' plumage has not escaped
me; but I find it a big leap to equate "pretty" face and form with
fecundity.  Cerebralization of human behavior surely led to many
complications for the maintenance of species-preserving instincts;
so I do not expect simple answers, no facile connection with
individual fitness.

But I would appreciate pointers to the most intelligent discussions
of these issues.

I have remarked that computers would be hindered in the development
of artificial intelligence until they had some fount of passionate
motivation.  (N.B. 'hal' in 2001.)
Small sub-issue: how far do higher primates go in the production and
appreciation of "music"?  I.A., there must be some connection
between the ability to comprehend speech and to respond to music.

Reply-to: (J. Lederberg)lederberg@rockvax.rockefeller.edu

Prof. Joshua Lederberg
The Rockefeller University
1230 York Avenue
New York, NY   10021-6399
212: 327-7809  fax -8651


<45:4>From mayerg@cs.uwp.edu Wed May 21 10:27:59 1997

Date: Wed, 21 May 1997 10:27:51 -0500 (CDT)
From: Gregory Mayer <mayerg@cs.uwp.edu>
Subject: Re: You shall know them
To: darwin-l@raven.cc.ukans.edu

A film was made of this book (or at least the plot was shamelessly
copied) ca. the early 70s , with a title something like "Skullduggery".
The near-humans were supposedly robust australopithecines, but their
makeup and costumes made them look like diminutive orangutans.

Gregory C. Mayer


<45:5>From jsl@rockvax.rockefeller.edu Wed May 21 23:40:31 1997

To: Gregory Mayer <mayerg@cs.uwp.edu>
To: darwin-l@raven.cc.ukans.edu
Subject: Re: You shall know them: Skullduggery
Date: Thu, 22 May 1997 00:40:41 EDT
From: Joshua Lederberg <jsl@rockvax.rockefeller.edu>

A film was made of this book (or at least the plot was shamelessly
copied) ca. the early 70s , with a title something like "Skullduggery".

The film was awful!!  A travesty.  Vercors refused to allow his
name on it.

"You shall know them"
The book is one of my favorites: it ends up with a subtle argument
for social-determinism of the nature of humanity -- but I don't want
to give the plot away.  Is it really out of print?

Let's start a discussion some months downstream of Vercors' main
propositions - but wait till more folks have had a chance to catch

Reply-to: (J. Lederberg)lederberg@rockvax.rockefeller.edu


<45:6>From Eugene.Leitl@lrz.uni-muenchen.de Thu May 22 01:51:35 1997

Date: Thu, 22 May 1997 08:51:21 +0200 (MET DST)
From: Eugene Leitl <Eugene.Leitl@lrz.uni-muenchen.de>
To: darwin-l@raven.cc.ukans.edu
Subject: FYI:Welcome to genetic-programming (long)

I thought this might be of possible interest to readers of this
mailing list. Discussion quality varies, being high on the average.
Caution: not a low-volume ist.


---------- Forwarded message ----------
Date: Sat, 10 May 1997 06:57:02 -0700 (PDT)
From: Majordomo@lists.Stanford.EDU
To: Eugene.Leitl@lrz.uni-muenchen.de
Subject: Welcome to genetic-programming

Welcome to the genetic-programming mailing list!

If you ever want to remove yourself from this mailing list,
you can send mail to "Majordomo@lists.Stanford.EDU" with the following command
in the body of your email message:

    unsubscribe genetic-programming

Here's the general information for the list you've
subscribed to, in case you don't already have it:

 G E N E T I C   P R O G R A M M I N G   F A Q

Table of contents:
1.  What is Genetic Programming [GP]?
2.  Why should I be interested in Genetic Programming?
4.  Where can I find out more about Genetic Progamming?
5.  I'm new to Genetic Algorithms.  Can you point me to
    some sources for more general information on GAs to put
    this in perspective?
6.  I'm new on this mailing list.  Can I see an archive of the old messages?
7.  The GP implementation in the book is presented entirely in Lisp.
    Do I need to use Lisp to have an implementation of GP?
8.  I would like to use GP with C instead of LISP because C is
    usually faster.  How does one use GP with C instead of LISP?
    In LISP the parse tree is obvious - in C it is not.  How does
    one eval C programs?
9.  FTP:  I don't want to have to type in the code in the appendices to
    the book.  Can I get the source code on-line somewhere?
    Is there a place from which I can FTP GP related papers and code?
    Can I get archives of the messages to this mailing list?
10. What's all this I hear about patents?
11. I'm not a sophisticated Lisp user and don't understand why
    so much apparent effort is spent in the Lisp code in the book
    worrying about FAST-EVAL.  Can't I just get things to go faster
    by compiling my population or some such.  This would avoid the
    apparent recursion in the FAST-EVAL function.
12. What specific areas are GPers working on right now?
   12.1 JK and JR
   12.2 Simon Handley
   12.3 Graham Spencer
   12.4 Craig Reynolds
   12.5 Howard Oakley
   12.6 William Archibald
   12.7 Walter Tackett
   12.8 Kim Kinnear
   12.9 Peter Angeline
   12.10 Sid Maxwell
   12.11 Ian Turton
   12.12 Joe Berkovitz
   12.13 Adam P.Fraser
   12.14 Manuel.a.Castro
13. Can I get the code that JK and JR run?
14. Has GP been shown to be better than Random Search?
   14.1 Peter Angeline's results.
15. What is Symbolic Regression?
16. I want to GP but I don't know what sort of machine to
    run it on.  What should I get?
17. Are there any local groups of Genetic Programmers
18. Demes:  What are they and why should I use them?

99. Glossary
100. Book order form.

1.  What is Genetic Programming [GP]?

To answer this question properly, we should probably first
consider the question "what's a GENETIC ALGORITHM?"

The Genetic Algorithm is a model of machine learning which
derives its behaviour from a metaphor of the processes of
evolution in nature.  This is done by the creation within
a machine of a population of individals represented by
chromosomes, in essence a set of character strings that
are analogous to the base-4 chromosomes that we see in our
own DNA.  The individals in the population then go through
a process of evolution.

We should note that evolution (in nature or anywhere
else) is not a purposive or directed process.  That is,
there is no evidence to support the assertion that the
goal of evolution is to produce Mankind.  Indeed, the
processes of nature seem to boil down to different
individuals competing for resources in the environment.
Some are better than others.  Those that are better are
more likely to survive and propagate their genetic

In nature, we see that the encoding for our genetic
information (genome) is done in a way that admits sexual
reproduction.  Asexual reproduction (such as by budding)
typically results in offspring that are genetically
identical to the parent.  Sexual reproduction allows the
creation of genetically radically different offspring
that are still of the same general flavour (species).

At the molecular level what occurs (wild
oversimplification alert) is that a pair of chromosomes
bump into one another, exchange chunks of genetic
information and drift apart.  This is the
"RECOMBINATION" operation, which GA/GPers generally
refer to as "CROSSOVER" because of the way that genetic
material crosses over from one chromosome to another.

The crossover operation happens in an environment where
the selection of who gets to mate is a function of the
"FITNESS" of the individual, i.e. how good the
individual is at competing in its environment.

Some genetic algorithms use a simple function of the
fitness measure to select individuals
(probabilistically) to undergo genetic operations such
as crossover or REPRODUCTION (the propagation of genetic
material unaltered).  This is fitness-proportionate
selection.  Other implementations use a model in which
certain randomly selected individuals in a subgroup
compete and the fittest is selected.  This is called
tournament selection and is the form of selection we see
in nature when stags rut to vie for the privilege of
mating with a herd of hinds.  The two processes that
most contribute to evolution are crossover and fitness
based selection/reproduction.

As it turns out, there are mathematical proofs that
indicate that the process of fitness proportionate
reproduction is, in fact, near optimal in some senses.

Mutation also plays a role in this process, though it
is not the dominant role that is popularly believed to
be the process of evolution, i.e.  random mutation and
survival of the fittest.  It cannot be stressed too
strongly that the genetic algorithm (as a simulation of
a genetic process) is not a "random search" for a
solution to a problem (highly fit individual).  The
genetic algorithm uses stochastic processes, but the
result is distinctly non-random (better than random).

Genetic algorithms are used for a number of different
application areas.  An example of this would be
multidimensional optimisation problems in which the
character string of the chromosome can be used to encode
the values for the different parameters being optimised.

In practice, therfore, we can implement this genetic
model of computation by having arrays of bits or
characters to represent the chromosomes.  Simple bit
manipulation operations allow the implementation of
crossover, mutation and other operations.  Although a
substantial amount of research has been performed on
variable-length strings and other structures, the
majority of work with genetic algorithms is focussed on
fixed-length character strings.  We should focus on both
this aspect of fixed-lengthness and the need to encode
the representation of the solution being sought as a
character string, since these are crucial aspects that
distinguish Genetic Programming, which does not have a
fixed length representation and there is typically no
encoding of the problem.

When the genetic algorithm is implemented it is usually
done in a manner that involves the following cycle:
Evaluate the fitness of all of the individuals in the
population.  Create a new population by performing
operations such as crossover, fitness-proportionate
reproduction and mutation on the individuals whose
fitness has just been measured.  Discard the old
population and iterate using the new population.

One iteration of this loop is refered to as a
GENERATION.  There is no theoretical reason for this as
an implementation model.  Indeed, we do not see this
punctuated behaviour in populations in nature as a
whole, but it is a convenient implementation model.

The first generation (generation 0) of this process
operates on a population of randomly generated
individuals.  From there on, the genetic operations, in
concert with the fitness measure, operate to improve the

Genetic Programming is the extension of the genetic model
of learning into the space of programs.  That is, the
objects that constitute the population are not
fixed-length character strings that encode possible
solutions to the problem at hand, they are programs that,
when executed, "are" the candidate solutions to the
problem.  These programs are expressed in genetic
programming as parse trees, rather than as lines of code.
Thus, for example, the simple program "a + b * c" would be
represented as:
                    / \
                   a   *
                      / \
                     b   c

or, to be precise, as suitable data structures linked
together to achieve this effect.  Because this is a very
simple thing to do in the programming language Lisp, many
GPers tend to use Lisp.  However, this is simply an
implementation detail.  There are straighforward methods
to implement GP using a non-Lisp programming environment.

The programs in the population are composed of elements
from the FUNCTION SET and the TERMINAL SET, which are
typically fixed sets of symbols selected to be
appropriate to the solution of problems in the domain
of interest.

In GP the crossover operation is implemented by taking
randomly selected subtrees in the individuals (selected
according to fitness) and exchanging them.

2.  Why should I be interested in Genetic Programming?
Genetic programming is a machine learning model which, its adherents
would claim, is the most general and flexible around.  It has already
been applied to a wide variety of problem domains and may well have
real-world utility.

3.  This mailing list is bugging me, how do I get off/ You've
    got my address screwed up/ I'd like to subscribe.
    How do I change the status quo?

Send mail to genetic-programming-REQUEST@cs.stanford.edu.

This mailing list is implemented using the Majordomo
mailing list server.  Messages sent to
genetic-proramming-REQUEST@cs.stanford.edu will be
processed automatically to achieve the desired
subscribe/unsubscribe/change of address.

The following are some of the commands available to you.
These commands should be on a line of their own.  The
subject of the message is ignored.

  subscribe genetic-programming

subscribes you according to the address you mailed the message from.

  subscribe genetic-programming <my-address>

will subscribe you with the specified address.

  unsubscribe genetic-programming


  subscribe genetic-programming <my-address>

should unsubscribe you.


will get you a help message.

genetic-programming@cs.stanford.edu.  DOING SO IS RUDE AND

4.  Where can I find out more about Genetic Progamming?
Genetic Programming is a comparatively new field, so there
isn't a large body of documentation on it.  [aside: is anyone
prepared to keep an up to date, on-line GP bibliography?]
A number of researchers are beginning to publish GP related papers.
At least for a while, the best place to start is likely to be
one of two sources:
  a.  "Genetic Programming:  On the Programming of Computers by Means
       of Natural Selection" - by John R. Koza, MIT Press 1992
  b.  "Genetic Programming - The Movie"  - by John R. Koza and
       James P. Rice, MIT Press 1992.
The movie is probably the cheapest and easiest way to get a quick
introduction into the sorts of things that have already been achieved
by GP.  The book is probably the best way to go about becoming a
GPer.  An order form for the book and movie can be found below.

5.  I'm new to Genetic Algorithms.  Can you point me to
    some sources for more general information on GAs to put
    this in perspective?

  Goldberg, David E.  1989.  "Genetic Algorithms in Search,
  Optimization, and Machine Learning."  Addison-Wesley. -
  Mix of theory and practice

  Davis, Lawrence.  1991.  "Handbook of Genetic Algorithms"
  Van Nostrand Reinhold. - Mostly practical.

  Holland, John H. 1992 "Adaptation in Natural and
  Artificial Systems."  MIT Press - Very theoretical.

  Michalewicz, Z.  1992.  "Genetic Algorithms + Data
  Structures = Evolution Programs."  Springer-Verlag -

  .....  suggestions anyone.... ?

6.   I'm new on this mailing list.  Can I see an archive of the old messages?

See the FAQ on FTP access (8).

7.  The GP implementation in the book is presented entirely in Lisp.
    Do I need to use Lisp to have an implementation of GP?

No, there is nothing about GP that requires Lisp, it's just a very
convenient language to use.  It's also the easiest way to write a
GP implementation that is small enough and simple enough to put into
the appendices of a book.  At present, the most easily available
implementation of GP is that shown in the book, so you would require
a Common Lisp implementation to use it.

However, William Archibald <billa@tdo.sps.mot.com has written
a C version.  This version is functionally similar to JK's
version, but does not use discrete generations.  The code is
public domain.

Walter Tackett and Aviram Carmi have written a shareware C version
called SGPC, a superset of the Rice/Koza code which is available via
the GP archives at utexas, the C User's Group, and through anonymous
ftp to the directory pub/Users/tackett on sfi.santafe.edu.  They
provide documentation, example applications, test data, and nominal

8.  I would like to use GP with C instead of LISP because C is
    usually faster.  How does one use GP with C instead of LISP?
    In LISP the parse tree is obvious - in C it is not.  How does
    one eval C programs?

Although it is by no means clear that C is faster in this
context, GP can be implemented in C/Pascal/...  in a
relatively straighforward manner.  Note that (given suitable
type declarations) code generated by a good Lisp compiler is
likely to be just as fast as that generated by a C compiler
unless you are very careful.  Lisp, however, carries a
larger price in working set and is less readily available.

Note that there are any number of people who would
venerate or refute these statements.  All can agree that
you will be best off working in the environment that you
the user are most comfortable with.  Execution speed and
working set are only partial factors of that environment.

The big problem that you have to tackle in coding a C based
version is that there is no garbage collection.  This means
that for anything beyond trivial applications there is a big
benefit associated with doing all of your storage allocation
statically and doing careful storage management.

As an example, Tackett and Carmi's SGPC (see section 6.0)
is a C program which manipulates raw parse trees that may
be read and written as LISP code for convenience and
"backward compatibility." Memory management is performed
by incremental allocation and deallocation of leaves and
branches as needed.  Locality and fragmentation are
managed by using the C library function mallopt() which
locally allocates small chunks of memory from a large
buffer supplied by the system in user-specified

There are lots of ways that one could Cify GP, but here is
a simple scheme:

 a) Statically create arrays to represent the individual
programs.  There should be one table representing the points
in the tree.  This will require a enough bits of width to be
able to represent (max (cardinality function-set)
(cardinality terminal-set)).  You need a second table two
bits wide of the same length to encode the four distinct
types of things you bump into in the trees: functions,
pseudo-macros, variable terminals and constant terminals.  A
third table is needed with enough bits to encode (reduce #'max
(mapcar #'length (mapcar #'arglist function-set))).  These
tables should be sized for the worst-case size of an
individual.  (500 points is probably enough to get along

 b) If the function set is to be {AND OR NOT} with arg
counts {2 2 1} and the terminal set is to be {D0 D1 D2 D3}
then you need two bits of width in the first table and two
bits for the third (you can strictly get away with 1 because
this function set has no zero arg functions, but it probably
isn't worth it if you're trying to make a general
framework).  Now, you make a mapping table that maps {0->AND,
1->OR, 2->NOT} and another that maps {0->D0, 1->D1, 2->D2,

This all means that you can linearise the parse tree into
the arrays in an efficient manner.  You can trivially write
an interpreter which does something of the form:

(defun interpret-program (program current-index)
 (with-slots (points-of-tree type-of-current-point arg-count-table) program
  (case (aref type-of-current-point current-index)
	(case (aref arg-count-table current-index)
	 (0 .....)
	 (1 (multiple-value-bind (value next-index)
	      ;;; Evaluate single arg for this function.
	      (interpret-program program (+ 1 current-index))
	      (values (funcall (aref function-mapping-table
			        (aref points-of-tree current-index))
   (variable-terminal ......)

i.e. you step along the program recursively evaluating the
arguments to functions when necessary, always returning both
the value of the subexpression and the index to the point
after the subtree just evaluated.

Obviously it's a bit more complex than this but the idea is
pretty simple.  It gets a little harder when you want to
have an arbitrary number of randomly generated constants,
since this can spoil your coding efficiency if you're not

 c) Implement the crossover (and other) genetic operations
so that rather than swapping subtrees it shuffles the values
in the tables up and down and copy the crossed over subtrees
(which are always contiguous in the linearised
representation) from one to the other.

 d) A simple resource architecture and a reference counting
scheme that decides what operations are to be performed to
which individuals in the population before any operations
are performed allows you to deallocate all of the
individuals that are not going to contribute to the next
generation.  Using the reference counting scheme you can
easily make it the case that you do not need to allocate
enough memory to accomodate 2X the size of the population.
You can always deallocate individuals before you are about
to need to get new ones to put the newly created individuals
into for the next generation.

9.  FTP:  I don't want to have to type in the code in the appendices to
    the book.  Can I get the source code on-line somewhere?
    Is there a place from which I can FTP GP related papers and code?
    Can I get archives of the messages to this mailing list?

An FPT site has been made available to GP users for both
source code, documentation and papers.

To browse/get stuff from this site you should do the

unix> ftp ftp.cc.utexas.edu
login> anonymous
password> <your-uid@your-host.your-domain>
ftp> cd pub/genetic-programming
....>random ftp commands
ftp> quit

The GP mailing list FAQ is kept in the above directory.
There are sundry subdirectories of interest: "papers",
"code" and "mail-archives".

Note that if you are GETing anything that is binary, you
should use the "binary" command to make sure that you
are in binary transfer mode.  Binary files will
typically be characterised by ".tar" or ".Z" file types.

To submit something to be kept on this site you should
do the following:

%ftp ftp.cc.utexas.edu
220 <boring ftp banner message...>
Name (ftp.cc.utexas.edu:mccoy): anonymous
331 Guest login ok, send e-mail address as password.
Password: <your email address>
230-<blah blah blah...>
ftp> cd /pub/genetic-programming
ftp> quote site group ftp-gp
ftp> quote site gpass mendel!2
ftp> cd incoming
ftp> put <myfilename...>

The "quote site..." lines will allow you extended
anonymous group access and let you cd into the incoming
directroy and write files to it.  The group name is
"ftp-gp" and the group password is "mendel!2".  For
security reasons you will not be able to read that
directory, and additionally we would like to ask that
you keep the password for this access among just us.
This facility makes things easier for the administrator,
but we don't want our submissions directory to be
overrun by people passing pirated software, etc...

You should then send a note to:


describing what it was that you just deposited.

Failing this, please send your submission by EMail to:


along with a note describing what you're submitting so
that it can be moved into the appropriate directory.
Please make it clear in the message whether your
submission is code, documentation, etc..

Sending mail to genetic-programming@cs.stanford.edu is a
good way to announce the availability of your submission
to the rest of the world.  It's a good idea to wait until
your submission has been moved into the appropriate
source/papers directory before you announce it.

To maximise the chances of other people being able to use
your code or read your papers you are strongly encouraged
to stick to commonly used formats, such as

   .tex, .PS, .tar, .Z, or .HQX

and simple text file types wherever possible.  Note that
not all postscript is printable on all postscript devices.
Apple generated .ps files often need a laserwriter
specific laserprep prelude in order to work properly.  If
you submit code, please try to make sure that you've tried
to compile/load it in a virgin environment without your
own special PATH setting or similar environment
customisation to try to eliminate the possibility of your
code requiring include files that you have not submitted.
This problem is particularly common with TeX macro

Jim McCoy, who maintains the archive will try to test out
all papers that come in and make sure code unpacks, etc.
He will probably put copies of stuff coming in in
.tar.Z for code and packages and .PS for papers (keeping
the .tex or .word available if that is what people submit,
but will also try to make a usable PS copy available.)

10. What's all this I hear about patents?
Process patents have been issued both for the bucket brigade
algorithm in classifier systems (to John Holland) and for GP
(to John Koza).  This FAQ does not attempt to provide legal
advice.  However, use of the Lisp code in the book is freely
licensed for academic use.  Although those wishing to make
commercial use of any process should obviously consult any
patent holders in question, it is pretty clear that it's not in
anyone's best interests to stifle GA/GP research and/or
development.  Commericial licenses much like those used for CAD
software can presumably be obtained for the use of these
processes where necessary.

11. I'm not a sophisticated Lisp user and don't understand why
    so much apparent effort is spent in the Lisp code in the book
    worrying about FAST-EVAL.  Can't I just get things to go faster
    by compiling my population or some such.  This would avoid the
    apparent recursion in the FAST-EVAL function.
To have a real understanding of what's going on in terms
of efficiency and what's best to do you'd need to know a
fair bit about Lisp and its implementation and a fair bit
about the application in question, but the argument
breaks down something like this:

 - When you do GP you have to execute a large number of
randomly created and/or evolved programs, which are
consequently not known at file-compile time.

 - The number of times you execute ANY GIVEN individual
program is (generally) the product of the number of
fitness cases and the number of timesteps over which you
need to simulate.  Thus, for a regression problem for a
simple one dimensional space (page 237) you might have
(say) 50 fitness cases, but no simulation and hence 50
evaluations per individual.  For a robotics problem like
the box-moving robot (p. 381) there might be 4 fitness
cases and 350 timesteps so there are up to 1400
evaluations per individual (note that individuals that
quiesce are aborted earlier.)  The number of effective
evaluations might be increased enormously by having
operators in the function set that cause iteration or by
the doing automatically defined function stuff (chapters
20 and 21), which is not suported in the GP code in the
book, but is not that hard to add.

 - There are two obvious ways to execute a program;
interpreting and compiling and funcalling.  There are
costs associated with each of these.  Interpreting is
slower, but you don't pay the up-front cost of
compilation, which is huge relative to evaluating a
simple expression.

 - There are costs in fitness computation other than
simply executing the program itself.  This is
demonstrated excellently by problems like the
pursuer-evader game (p423) or the broom balancing problem
(p289), where part of the simulation happens after the
evaluation of the individual, since the program delivers
the value of the control variable, which is then used to
perform the state transitions of the simulation.  Since
these computations typically involve complicated uses of
transcendental functions they can often be more expensive
than the execution of the program itself.

Thus, what you're trying to do when you do GP is to
maximise the number of calls to the fitness function per
second.  This might be optimised by compiling the
individuals or by interpreting them, or by spending time
on optimising the fitness function.  You can only really
know for sure by metering the behaviour of the run and
trying each (your intuition is almost sure to be wrong in
our experience).  However, our experience tells us that
interpreting is typically faster for the majority of
problems, particularly those that novice users of the
little GP code in the book are likely to be tackling to
start with.  There are also operational issues associated
with compiling the individuals, but we can neglect that
for now.

So, when we say that fast-eval is faster, we're not
comparing it to compilation, we're comparing it to the
interpreter that comes for free with Lisp (eval).  The
reason it is typically faster than eval is that eval has
to carry all of the baggage associated with being a
complete Common-Lisp interpreter and fast-eval doesn't.
This baggage includes things like lexical closures, local
variables and functions and macros, none of which are
needed by GP.

In GP there are really only four classes of things that
are processed: constants, like 3.5, variables, like X,
functions, like +, and functions that control their own
argument evaluation, which we call pseudo-macros, such as
IFLTE.  All of the complicated implementation-specific
hair in fast-eval is caused by the desire to discriminate
between these classes rapidly at run-time.

All Lisps have highly optimised means to execute things
like consp and symbolp.  The only problem comes in trying
to discriminate between the two classes of "internal
points", i.e. functions and pseudo-macros.  CL doesn't
provide a specific optimised hook for this and all
implementations represent their functions somewhat
differently, so we have to hunt around for the fastest
way we can on any particular implementation.  Note that
some implementations are rather fascist about what you
can legally put in the function cell of a symbol, so this
causes some of the reasons for the non-portability.
Indeed, Lucid tightened up on things between release 4.0
and 4.1 and broke the code in the book.  Maybe we'll be
able to get a patch in if there's a second printing.

As far as avoiding the recursion is concerned, this is
not as trivial as it may sound (not necessarily as
desirable as you might think either).  If your
sexpression is (say)

  (setq my-sexpr '(+ a (* b (- c 42))))

then compiling this to give you

  (setq my-compiled-sexpr (compile nil `(lamda () ,my-sexpr)))

will deliver a function that in most implementations will
have open coded the references to the functions +, * and
-.  Note that you might want to put type delcarations
into these functions, such as:

  (compile `(lambda () (declare (type integer a b c)) ,my-sexpr))

to get optimal code generation.

However, if your sexpression was of the form:

  (progn (move-left) (iflte (turn-right) 42 (move-forward) (move-backwards)))

Then you are presented with two problems.

  a) The bodies of the functions move-left etc. will probably
     not get open coded into the compiled individual.  Functions like
     move-left are likely to be trivially implemented, i.e. move-left
     might consist of nothing more than (decf *x-coord* 1.0) Note that
     what one is really trying to avoid is not recursion pre se, but
     rather function-call overhead, which is highest for calls to non-
     (self/mutually) tail-recursive functions.  Thus, failure to
     open code functions like move-left results in compilation giving
     almost no benefit (you'd like to get ~20X speedup from compilation
     in general).  Open coding can usually be achieved by proclaiming
     the functions in question to be inline, but this does not work
     on every implementation and may be suboptimal.  You might have to
     declare compiler-macros (compiler-optimizers) for the functions
     in question to get the right open coding.

  b) In the example above there is an instance of a pseudo-macro call
     to IFLTE.  If it is the case that you always compile your individuals
     then you can just define these as real macros.  Real macros, however
     are incompatible with fast-eval, so you can't use real macros in
     interpreted mode.  One of the main reasons for using pseudo-macros
     and fast-eval is because macro-expansion is very expensive and
     undesirable at GP run-time, especially when all you're using them
     for is as functions that control their argument evaluation.  If you
     want to have your cake and eat it, then you'll need to define
     compiler macros for each pseudo-macro so that they get open coded
     properly.  In the GP code that JK and I use we have a hairy and
     complex chunk of code that automatically generates pseudo-macros
     for our version of fast-eval as well as all of the necessary
     compiler optimizer stuff in order to get around all of these

As you will have spotted, this is not a trivial issue.  Any given
problem will have its own specific usage/performance profile and
you'll need to take things on a case by case basis.  The code in the
book is intentionally supposed to be as simple as possible and yet
show all of the fundamental principles of GP.  For this reason, there
are no type declarations, even though these would doubtless increase
performance.  Type declarations would have increased the size of the
code by a fair chunk and would have clouded the issue too much, we
feel.  Many compilers have a knob you can tweek that will get it
to advise you of good places to put type declarations that will have
the best effect (try fast-eval first, of course).

12. What specific areas are GPers working on right now?

12.1 JK and JR are working on a better understanding of the automatic
discovery of facilitating functions, and exotic fitness measures.

12.2 Simon Handley is working on the prediction of protein
characteristics from sequence data.

12.3 Graham Spencer is learning how to control 6-legged robots.

12.4 Craig Reynolds says:
    Using a simple model of two dimensional mobile "critters"
    (similar to Braitenburgs "vehicles") I have investigated
    obstacle avoidance behavior and coordinated group motion.
    The critters can get information about their world through
    a primitive sort of vision.  They move through their world
    at a constant rate.  The evolved control program "looks"
    at the world and decides how to best "steer" the critter's

    In one set of experiments I started the critter in the
    middle of a maze-like set of "obstacles" (line segments).
    A collision with an obstacle was fatal to a critter.  The
    goal was to take a specified number of steps (300) without
    a collision.  A controller's fitness was based on the
    number of steps taken before a collision.

    After a few fits and starts (and a lot of helpful email
    advice from John Koza and James Rice) I was able to evolve
    a nimble critter that would zip through the obstacles with
    nary a scratch.  But since it was "trained" using only a
    single obstacle course, it was not a robust, general
    purpose obstacle-avoider.  I tried the same experiment but
    with fitness based on three runs with slight perturbations
    of starting position and orientation.  That seemed to make
    the problem much harder, and I haven't yet evolved a
    successful critter for this problem.  I described this
    work in a draft paper submitted to ALife III called "An
    Evolved, Vision-Based Model of Obstacle Avoidance

    I also did some experiments with "coordinated group
    motion", sort of a precursor to real herding.  In that
    problem the world consisted of a group of critters, some
    obstacles, and a predator.  To survive the critters had to
    avoid collisions with the static obstacles, the other
    moving critters, and the pursuing predator.  This work is
    "in progress" in that the only success seen so far is that
    the "quality of herding" metric for the population of
    controller programs can be seen to increase over time.
    But the resulting behavior is not even close to something
    that could be called coordinated group motion, let alone
    herding.  Nonetheless (not for a moment deterred by the
    lack of crisp results!) I wrote a progress report on the
    experiments so far that was accepted by last month's
    conference on Simulation of Adaptive Behavior.  The paper
    is called: "An Evolved, Vision-Based Behavioral Model of
    Coordinated Group Motion"

12.5 Howard Oakley:
    (a) curve-fitting physiological data (comparing statistics and GP)
    (b) optimising stack and FIR filters for noisy (physiological) data
        (comparing GA and GP)
    (c) I hope to build polished applications to allow the non-programming
        scientist to attempt (a) and (b) as well, over the coming 6
        months or so, using MCL.

12.6 William Archibald says that:
       [...] two of us are working on:
	 . image compression
	 . a package that aids in the development of control systems
	 . control of articulated figures for computer graphics
	 . prediction of protein folding
         [And I'm supposed to be working on VLSI design!]

12.7 Walter Tackett says:
	 I recently completed a study using GP for the generation of
	 classification trees for target/nontarget discrimination.  In
	 doing so, I have added a multiobjective optimization feature
	 to the GP system.  One significant feature of this research
	 relative to the 2-class ``nested spirals'' problem in the GP
	 text is that the data in question is noisy, not completely
	 separable, and no ``true'' known solution surface exists.
	 About 2000 data samples (each a 20-element vector) were used
	 to train the system and 7500 were used in test.  The same
	 database was used to train a modified backprop-trained
	 network as well as a classical statistical ``binary tree''
	 classifier.  GP performed significantly better than either
	 backprop or btree.  In addition, the best tree created by GP
	 required 1/8 the computation of the backprop network.

	 A second, and possibly even more interesting, experiment was
	 performed by getting rid of the 20 human/heuristic/"statistical"
	 features used above and replacing them with seven primitive
	 measures of intensity (basically intensity mean and sigma for
	 multiple resolutions).  Those 20 features thought up by real
	 smart human people provide a sophisticated description
	 of the image that is better than the primitive measures - *NOT*.
	 GP systhesizes features from those primitive measurements
	 and achieves (1) slightly higher performance and (2) another
	 half-order-of-magnitude reduction in complexity of the LISP
	 expression required to perform the classification.

	 The above work is documented in a paper accepted to
	 Machine Learning 93 and others submitted to ICGA 93 and
	 WCNN 93/Portland.  One of these papers is currently available
	 from sfi.santafe.edu in the directory pub/Users/tackett via
	 anon ftp, and may soon be available in the GP archives.

	 Currently, Avi and I are frantically banging out megabytes
	 of data on a study of generalization in GP-based
	 classifiers and it's relation to different breeding
	 policies (i.e., steady state elitism and various
	 configurations of "demes").  We think we have discovered a
	 criteria for judging when such a classifier has achieved
	 generalization.  We will eventually post the abstract to
	 the GP list.

	 In support of our ongoing research Aviram Carmi and I have
	 developed a system called GPC (Genetic Programming in C),
	 which features:
	 1) Multiple interacting populations
	 2) Multiobjective fitness
	 3) Demes
	 4) Steady State Elitism
	 5) Parallel Implementation
	 6) X Interface
	 7) Other  ;-)

	 There is a simpler (read: stable, debugged) version called
	 SGPC which is currently available via anonymous ftp to
	 pub/Users/tackett on sfi.santafe.edu, through the GP
	 archives on UTexas, or (for those without ftp) on diskette
	 for a nominal distribution charge ($3 American, I think)
	 from the C Users Group.  You can reach them at:

                           The C Users' Group
                        1601 W. 23rd St. Suite 200
                            Lawrence, KS 66046
                              (913) 841-1631

	 In addition, we have written GPX/Abstractica, a Karl Sims
	 workalike program: this code creates abstract images using
	 your aesthetic taste as a fitness function.  It is
	 available as an executable for the Sun4 via anon ftp from
	 sfi.santafe.edu pub/Users/tackett or from the GP Archives.

	 GPX/Abstractica, a Karl Sims workalike program.  If you don't know
	 what this is check it out!  It is available as an executable for the
	 Sun4 via anon ftp from sfi.santafe.edu pub/Users/tackett or from
	 the GP Archives.

	 My dissertation research (USC) focuses on GP as well: I am
	 tackling an advanced application where fitness is
	 determined via a large finite element/discrete time
	 simulation.  This work deals with some interesting
	 problems of parallelism, coevolution, and emergence
	 heretofore unaddressed.

	 In addition, Hughes Aircraft is funding us to explore GP
	 for (1) A plan generator for autonomous vehicles and (2)
	 some basic extensions to the GP learning algorithm.  I
	 have some very definite ideas about applications of GP to
	 various aspects of imaging as well.

12.8 Kim Kinnear (kim.kinnear@sun.com) says:
	 I have been using GP to successfully evolve iterative and
	 fully general sorting algorithms.  I am exploring various
	 function and terminal sets and their relative difficulties,
	 and using sorting as a laboratory to develop and test
	 modifications to "standard" G.P such as Craig Reynolds
	 Steady State GP (which really helped) and Peter Angeline's
	 subroutines (which are neat, but haven't helped yet).  My
	 long term interest is in appling GP to problems of
	 significantly greater complexity and practical utility than
	 sorting.  I am actively pursuing modifications and
	 enhancements to GP that will improve its performance on such
	 problems, as well as searching for practical problems that
	 need solving and yet are within the capabilities of GP.

12.9 Peter Angeline, Ohio State University (PHD Candidate)
     pja@cis-ohio-state.edu says:

	 I am currently working on several projects related
	 specifically to genetic programming and what I call
	 evolutionary algorithms as a whole.

	 1) Co-Evolving High-Level Representations - This project
	 concentrates on acquiring subroutines from developing
	 programs while they are being evolved.  The Genetic Library
	 Builder (GLiB) uses a novel form of mutation to compress
	 subtrees of a developing genetic program into a subroutine
	 which is added to the genetic library for future use.  Once
	 a subroutine is created its name, generated as a random
	 variable, becomes a new, high-level atom of the
	 representation language for the programs.  Beneficial
	 subroutines are automatically preserved by the normal
	 evolutionary process.  This is similar to Prof.  Koza's
	 Automatic Function Definition discussed in chapters 20 and
	 21 of THE BOOK, albeit a bit more flexible.  Papers are
	 available by request.

	 2) Self-Competing Genetic Programs - In this project I am
	 asking the question "What kind of teacher is best for
	 evolving genetic programs?" I am especially interested in
	 evolving programs to play games, such as Tic Tac Toe and
	 hopefully more complicated games later.  One interesting
	 method of training is to allow the members of the population
	 to compete against each other to determine their fitness
	 value rather than against an "expert" programmed by the
	 researcher.  So, fitness is computed by having members of
	 the population play each other in a single elimination
	 tournament fashion with the fitness of a population member
	 being the number of games won in the tournament.  Notice
	 this provides only a coarse ranking of the population and
	 can be a very noisy source of feedback.  I have run
	 experiments that demonstrate self-competive populations
	 create more robust genetic programs for Tic Tac Toe than
	 ones evolved using a random expert, a near expert and a
	 perfect expert.

	 3) A Taxonomy of Evolutionary Algorithms - I am also
	 interested in understanding what makes all evolutionary
	 algorithms work and how to apply them to problems.  This has
	 lead to the development of a taxonomy of the major
	 components of all evolutionary algorithms, including genetic
	 programming, and an initial attempt at how the pieces work
	 together to make the whole.

12.10 Sid Maxwell <smaxwell@borland.com> says:

	 My original interest beyond a general fascination with
	 the techniques was/is using GP to help evolve a program
	 which recognizes features on a Go board and suggests
	 moves.  Since the GP result is a program, there's some
	 chance that I will be able to interpret the result and
	 understand what the program thinks is important, and why.

	 From a more theoretical point though, I've been nagged
	 by the questions of "how big does a population have to be
	 to get a reasonable chance for a solution to develop" and
	 "what about using a rich set of operations and/or types".
	 It seems to me that the success of reaching a solution is
	 somehow based on the assurance that the necessary
	 operators and operands be represented in enough numbers
	 in the population to make it to the 'final' generation,
	 i.e.  that the population doesn't lose some necessary
	 component as less fit individuals are purged to make room
	 for new individuals.

	 The comment made at the Bay Area GP meeting about
	 monitoring CSEs suggested to me that I could monitor the
	 relative frequency of each internal and leaf node as a
	 population evolves to look for a correlation with
	 success.  What I'm hoping is that it'll be possible to
	 use a rich operator set with relatively smaller
	 populations (based on the number of operators), insuring
	 operator variety and representation by adding an operator
	 (similar to Koza's mutation) whenever that operator's in
	 a population goes below some pre-determined threshold.

12.11 Ian Turton <geo6it@geog.leeds.ac.uk>
      - School of Geography, Leeds University says:

	 I'm working on GP in fortran, since its the lowest common
	 denominator between myself and my boss :-) We are
	 attempting to use GP to create a better spatial
	 interaction model.  This involves taking a matrix of
	 origin and destination trips and a cost function for
	 trips between each zone and predicting the number of
	 trips between zones.  The original work on this was done
	 by Stan Openshaw (my boss) using a genetic algorithm to
	 breed equations (Openshaw, Geographical Analysis
	 v20,N1,31-46, 1988).  Though his main problem was making
	 sure that equations produced by this method where
	 syntactically correct, with GP we use a lisp type syntax
	 and make sure cross over only occurs in the right places.
	 Evaluation is done on the whole matrix at once, which is
	 quite important with 5229 cases, this also allows us to
	 add row and column summation operators.

12.12 Joe Berkovitz <JoeB@edc.org>, Education Development
      Center, Inc.  says:

	 Currently I'm working with co-evolution of pattern
	 generators and recognizers in two separate but
	 interacting populations.  There is no absolute fitness
	 function, merely a set of example patterns; the goal is
	 to generate new patterns that are "like" the examples in
	 some to-be-induced fashion.

	 Recognizers are selected for reproduction/crossover on
	 the basis of their ability to distinguish example
	 patterns from the patterns generated by the individuals
	 in the population of generators.  Generators, on the
	 other hand, are selected fir their ability to generate
	 patterns that the most discriminating recognizers think
	 are similar to the example patterns.

12.13 Adam P.Fraser <a.fraser@eee.salford.ac.uk>,
      University Of Salford, England says:

	 Currently I use GP to study emergent behaviours.  We have
	 a group here that is dealing with multpile robots and I
	 am hoping that GP can be used to produce an emergent

	 The fitness function is calculated from a colony of
	 robots {automata/insects/cells} who do some task which
	 would be improved by them co-operating.  Arguments abound
	 about whether such co-operation is 'real' but as the
	 observer can see the automata working together I believe
	 it must be.  I am currently limiting myself to try and
	 use no communication and no memory.

	 {As a hobby I am working on Virtual Ant modules which use
	 GA techniques to evolve interesting behaviours.}

	 This work is very much at the embryonic stage.

12.14 Manuel.a.Castro <man@taf.fundesco.es>

	 I'm doing a work in GA+GP for my degree in
	 Ingenieria Industrial -I think the American
	 equivalent is Civil Engineering- majoring in
	 Automatica -that's your Process Control.

	 My goal is to ease the procces of control of
	 complex systems using `learning controllers'.
	 I pretend to use a 'mutated' Classifier System
	 with a simulation of a process -maybe cement
	 production.  The mutation in the CS is simple:
	 my rules have the form:

		 condition program -> action

	 The CS's internal state is a set of variables,
	 accessible by the programs.  It seems to me
	 that with this scheme a hierarchy of
	 'subrutines' will emerge, not a single program
	 but a bigger 'program' made up of this
	 subrutines.  When the controller achieves a
	 certain performance, you use it with the real
	 system (maybe with the genetic evolution turned
	 off).  This last step is very uncertain in my

	 I'll begin this week the programming phase, I
	 think the first prototype will see the light
	 (the cathodic rays) in a month.  There're still
	 many books and papers I must see -mainly in the
	 CS part- and select an appropriate credit
	 algorithm but I'm eager to run something!

.....   Anyone want to advertise research goals?......

13.  Can I get the code that JK and JR run?

No.  It only works on Explorers, is undocumented and would
be a total nightmare for us to have to support outside
users.  Our suggestion: take the little implementation in
the book and enhance it (see Appendix D for suggestions).
That way you'll understand you code.

14. Has GP been shown to be better than Random Search?

For a number of methodological reasons, a tightly defined
answer to this question is too complex to include here.
An expansive coverage is given in chapter 9 of the book.

Having said this, the high order bit of the answer to this
question is an absolute and emphatic "YES".  As examples
of this superior to random behaviour are shown in the book
and are well documented in the literature.  For example:

13.1 Pete Angeline evolved solutions to the tic-tac-toe
problem with only 200,000 individuals of effort
(population size of 1000 run for 200 generations).  In
trying 200,000 random individuals he observed nothing that
even half way played the game.  The fitness of the best
randomly generated program was 2.25 compared to a score of
16.5 for the best evolved program.  See his paper for
details on the scoring function.  This result is still
unpublished and will appear in his dissertation.

15. What is Symbolic Regression?

Symbolic regression (chapter 10 in the book) is the
process of discovering both the functional form of a
target function and all of its necessary coefficients, or
at least an approximation to these.  This is distinct from
other forms of regression such as polynomial regression in
which you are merely trying to find the coefficients of a
polynomial of a prespecified order.  In some senses, all
GP problems are symbolic regression problems, but the term
is usually used to apply to finding real-valued functions.

16. I want to GP but I don't know what sort of machine to
    run it on.  What should I get?

This is very much a "how long is a piece of string"
question.  However, GP is a very resource intensive
process and it is probably a good idea to get the fastest
machine that you can (no surprise).

Howard Oakley <Howard@quercus.demon.co.uk> has done a set
of benchmarks comparing different machines running the GP
Lisp code in the book doing a simple problem.  Although
performance will vary over problem domain and according to
whether you choose to run in Lisp or in some other
language, these figures give a reasonable idea of the
relative performance of the machines.  Which machine you
choose to use will clearly depend on local pricing for
the platforms in question.

Machine         Lisp           Test 1    Test 2
                          (times in seconds, gross)

Mac IIci        MCL	2.0        90.59     303.58
Mac IIfx        MCL	2.0        81.18     214.96
Mac IIci+Rocket MCL	2.0        34.17     100.77
Mac Quadra 950  MCL	2.0        28.91     100.22

Sun 10/20	Sparc Lucid	4.1      13.06       ???
Sun 10/30	Sparc Lucid	4.1      11.10      29.81
Sun 10/30	Sparc Franz Allegro   6.25      32.48
Sun 4/470	Sparc Franz Allegro  17.19      68.53
Sun IPC         Lucid	4.1      29.21       ???
Sun IPC         Franz Allegro  14.85      80.86

TI Explorer II  TICL	6.1       13.28     143.64
SGI IndigoR3000 AKCL	6.15     259.09       ???
SGI IndigoR3000 Franz Allegro  16.25      80.56

(Franz Allegro CL was V 4.1)
(note that all results are now consistent and correct, following the Lucid
(times above are gross, including all OS and GC overhead)

More details on systems:
Machine         CPU	  Clock speed   OS      RAM virtual RAM actual RAM used
(if fixed)

Mac IIci        68030     25      7.0.1      20     12             22
Mac IIfx        68030     40      7.0.1      36      0             30
Mac IIci+Rocket	68040     33      7.0.1      20      0             16
Mac Quadra 950  68040     33      7.0.1      36      0             32
Sun 10/20       Sparc             4.1.3      32    180
Sun 10/30       Sparc             4.1.3      32    180
Sun 4/470       Sparc     33      4.1.1      32
Sun IPC                           4.1.3      32    400
TI Explorer II                    6.1        16    128
SGI IndigoR3000	MIPS 3000	33      IRIX 4.0.5	48    120

The 'league table' of speed based on the second test is thus:
1      Sun 10/30 & Lucid           1.0 (relative to fastest)
2      Sun 10/30 & Franz Allegro   1.1
3      Sun 4/470 & Franz Allegro   2.3
4      SGI Indigo & Franz Allegro  2.7
5      Sun IPC & Franz Allegro     2.7
6      Macintosh Q950 & MCL        3.4
7      Macintosh IIci+Rocket & MCL 3.4
8      TI Explorer II              4.8
9      Macintosh IIfx & MCL        7.2
10     Macintosh IIci & MCL       10.2

Please send any further results to Howard Oakley

17.   Are there any local groups of Genetic Programmers?

Yes, there is one in the San Fransico bay area and one in
New England.  To get into the SF bay area one, send mail
to BA-GP-Request@CS.Stanford.Edu.  The Bay Area group
seems to be likely to have monthly meetings.

The New England GP group will hold a Boston area meeting
in September.  Interested parties should contact Andrew
Singleton at <p00396@psilink.com>.

The first meeting of the New England GP group was held on
5/23/93.  Participants from Massachussets, New Hampshire
and Maine had an excellent discussion covering theoretical
and practical and even commercial issues in which there
was "much crossover and little convergence." Residents of
Connecticut and New York are also on the list and standing
by.  Get on the list to stay informed and meet the elite.

18.  Demes:  What are they and why should I use them?

[This Entry kindly furnished by Ron Goldthwaite]

To date, GP has used single populations simulated on
serial processors.  As sketched by Koza (Ch.22), the move
to parallel architectures can involve distributing tasks
(fitness cases) or entire runs.  Koza recommended the
latter since the rate of improvement is greatest in early
generations of any population.

Without migration, distributed runs reduce each
population's size.  But as observed and theorized in
population genetics (PG), and observed with GA, very
modest levels of migration can effectively make
subpopulations equivalent to a single larger population
(effective 'panmixia'), and even slight migration alters
dynamics significantly.  The general migration levels
estimated from PG (notably Sewall Wright's work
(~1930-1987)) are: two or more migrating individuals per
deme per generation give effective panmixia; less than one
individual every 4th generation keeps effective isolation;
and interesting dynamics arise for in-between levels.
(Oversimplification alert: these estimates apply literally
only to non-interacting genes, but epistasis - change in
fitness due to the context of other genes at other loci -

Demes -neighborhoods- may be established several ways.
Island models assume subpopulations with migration to any
other island without distance constraints; much PG theory
assumes this layout.  Stepping-stone models (imagine a
Japanese garden) restrict migration to adjacent islands.
There is little guidance available for choosing the number
of islands; minimally 5-10 are justified by statistic's
Central Limit theorem.  Isolation-by-distance doesn't
involve islands as such, but distributes individuals
spatially and limits mating to close neighbors by a steep
migration probability function.  PG results here are
qualitatively similar to those for island models.

GA & PG dynamics: demes can maintain diversity in the face
of GA's well-known tendency to converge, often
prematurely.  Like immunologists, we too seek GODs
(Generators Of Diversity) in the face of fixation
(convergence) due to selection and just simple drift in
small populations.  GA populations fixate quickly because
similar high-fitness strings are favored in reproduction,
and crossover between these similar strings discourages
diversity - in effect, there's substantial inbreeding,
which PG predicts strongly fosters fixation.  GA diversity
can be maintained by exploiting the divergence of several
demes, so that while within-deme variance decreases,
between-deme variance increases.  Note that GP behaves
very differently here: the crossover operator applied to
two similar, even identical, s-expressions can generate
wildly divergent offspring (Koza p.103-4; and example
p.616), since the crossover points generally differ in the
two trees, not preserving lengths or subtree locations.

Wright's 'shifting balance' theory was developed in PG in
the '30's and incorporated since by several parallel GA
designs (see 3rd and 4th ICGA Proceedings).  Wright's
models are sophisticated, but analytic and generally
limited to simpler analyses (e.g.  non-interacting genes)
Collins used a 'massively parallel' model (on a Connection
Machine) to simulate isolation by distance in a GA (and
several other PG phenomena) (1991 UCLA PhD dissertation,
4th ICGA and subsequent pubs, many available by ftp from
ftp.cognet.ucla.edu /pub/alife/papers) .  His total
population size is 8k-512k.  Compared to panmixia, the
structured populations found solutions up to 25 times
faster.  When limited to 1k generations, structured pops
always found at least one of the two optimal solutions to
his very difficult graph partitioning problem, while a
panmictic population often failed altogether.  An
additional dynamic of spontaneous spatial polymorphism
emerged from structuring: although selection was uniform,
the terrain became a patchwork of alternate genotypes for
the two optima, separated by distinct suboptimal zones of
much greater hybrid heterogeneity (hybrid zones are well
known in PG).  Similar processes in GP are unexplored.

(An alternative to Wright's view in PG is R.A.Fisher's
(also 1930's, and recently in J.  Gillespie's 1991 book),
in which the adaptive surface is too complex and too
dynamic for even demes to adapt - no stable peaks exist.)

Readings: William Provine's _Sewall Wright & Evolutionary
Biology_ (1986) gives an historical view;

John Maynard-Smith's _Evolutionary Genetics_ (1989)
presents population genetic models in a manner useful for
computer scientists.

99.   Glossary

Genetic Algorithm (GA) - Model of machine learning that
uses a genetic/evolutionary metaphor.  Implementations
typically use fixed-length character strings to represent
their genetic information.

Genetic Programming (GP)- Genetic Algorithms applied to
programs.  Genetic Programming is more expressive than
fixed-length character string GAs, though GAs are likely
to be more efficient for some classes of problems.

Recombination - (see crossover)

Crossover - The genetic process by which genetic material
is exchanged between individuals in the population.

Reproduction - The genetic operation which causes an
exact copy of the genetic representation of an
individual to be made in the population.

Generation - An iteration of the measurement of fitness
and the creation of a new population by means of
genetic operations.

Function set - the set of operators used in GP, these
functions label the internal (non-leaf) points of the
parse trees that represent the programs in the
population.  An example function set might be {+, -, *}.

Terminal set - the set of terminal (leaf) nodes in the
parse trees representing the programs in the
population.  A terminal might be a variable, such as X,
a constant value, such as 42, or a function taking no
arguments, such as (move-north).

100.  Book order form

---------------------------ORDER FORM----------------------

PHONE: 1-800-356-0343 TOLL-FREE or 617-625-8569
MAIL:  The MIT Press, 55 Hayward Street, Cambridge, MA 02142
FAX:  617-625-9080

Please send
____ copies of the book GENETIC PROGRAMMING: ON THE
John R. Koza (KOZGII) (ISBN 0-262-11170-5) @ $55.00.
____ copies of the one-hour videotape GENETIC PROGRAMMING: THE
MOVIE by John R. Koza and James P. Rice  in VHS NTSC format
(KOZGVV) (ISBN 0-262-61084-1) @$34.95
____ copies of the videotape in PAL format (KOZGPV) (ISBN 0-262-
61087-6) @$44.95
____ copies of the videotape in SECAM format (KOZGSV) (ISBN 0-
262-61088-4) @44.95.

Name __________________________________





Phone Number ___________________________

$ _______ Total
$ _______ Shipping and Handling ($3 per item. Outside U.S. and
Canada, add $6 per item for surface rate or $22 per item for airmail)
$ _______ Canada - Add 7% GST
$ _______ Total due MIT Press

__ Payment attached (check payable to The MIT Press in U.S. funds)
__ Please charge to my VISA or MASTERCARD credit card

Number ________________________________
Credit Card Expires _________________________________
Signature  ________________________________


<45:7>From bayla@pb.seflin.org Wed May 21 23:20:10 1997

Date: Thu, 22 May 1997 00:20:00 -0400 (EDT)
From: Bayla Singer <bayla@pb.seflin.org>
Subject: Re: You shall know them
To: darwin-l@raven.cc.ukans.edu

The question of defining or recognizing the "humanity," sapience, or other
classification of newly discovered species is a common theme in science
fiction.  There is an extensive selection of both novels and short stories
dealing with this issue.

Many of Isaac Asimov's later Robot stories also dealt with the difficulty
of distinguishing between robots and humans on the basis of behavior,
given a set of three relatively simple rules governing robotic action.

--Bayla    bayla@pb.seflin.org


<45:8>From ahouse@hydra.rose.brandeis.edu Thu May 22 06:41:43 1997

Date: Thu, 22 May 97 07:42:10 -0400
To: darwin-l@raven.cc.ukans.edu
From: "Jeremy C. Ahouse" <ahouse@hydra.rose.brandeis.edu>
Subject: Re: Ontogeny recapitulates...

>I would be very interested to learn about the current standing of Ernst
>Haeckel's dictum, "Ontogeny recapitulates phylogeny."
>Tzvi Langermann

	A fine place to start is Gould (1977), Alberch et al. (1979) and
McKinney & McNamara (1991). You may then find  Raff (1996) and Hall (1992)

	- Jeremy

Alberch, P., S. J. Gould, G. F. Oster, D. B. Wake (1979) Size and shape in
ontogeny and phylogeny, Paleobiology 5: 296-317.

Gould, S. J. (1977) Ontogeny and Phylogeny. Harvard Univ. Press: Cambridge.

Hall, B. K. (1992) Evolutionary Developmental Biology. Chapman & Hall: London.

McKinney, M. L. and K. J. McNamara (1991) Heterochrony: the Evolution of
Ontogeny, Plenum Press.

Raff, R. A. (1996) The shape of life: Genes, Development and the evolution
of animal form. Chicago Univ. Press: Chicago.

        Jeremy C. Ahouse
        Biology Department
        Brandeis University
        Waltham, MA 02254-9110
ph:     (617) 736-4954
fax:    (617) 736-2405
email:  ahouse@hydra.rose.brandeis.edu
web:    http://www.rose.brandeis.edu/users/simister/pages/Ahouse


<45:9>From DARWIN@iris.uncg.edu Fri May 23 14:19:20 1997

Date: Fri, 23 May 1997 15:18:28 -0400 (EDT)
From: DARWIN@iris.uncg.edu
Subject: Announcement and CFP: Journal of Memetics (fwd)
To: darwin-l@ukanaix.cc.ukans.edu
Organization: University of NC at Greensboro

--begin forwarded message--------------

From: "Hans-Cees Speel" <HANSS@sepa.tudelft.nl>
Organization:  TU Delft
To: darwin-l@raven.cc.ukans.edu
Date: Thu, 22 May 1997 11:32:02 MET
Subject: apologies and Announcement & CFP: Journal of Memetics

Please forward this message to relevant mailing lists and newsgroups. We
apologize for possible cross-postings.

                    Announcement and Call for Papers


                       JOURNAL OF MEMETICS -
       Evolutionary Models of Information Transmission (JOM-EMIT)


JOM-EMIT is a new scientific journal published exclusively on the Web, at URL:


The journal has no subscription fee.

JOM-EMIT is the first peer-reviewed journal devoted to the development of
the memetic perspective. This perspective looks at culture, communication
and information transmission as evolutionary phenomena, governed by the
mechanisms of variation, replication and selection. The concept of "meme"
was proposed by Richard Dawkins as  'a unit of cultural transmission, or a
unit of imitation'. The vagueness of this initial description may explain
the present diversity of views on what a meme really is, and how the
memetic model can be used. JOM-EMIT wishes to clarify these issues, by
stimulating a constructive dialogue between the active researchers in the

More specifically JOM-EMIT seeks to discuss the following topics related to

- Comparisons of different mechanisms and models of evolutionary processes,
  possibly in distinct fields of inquiry.
- Philosophical or theoretical issues concerning epistemology and evolution
- Boundaries of the evolutionary approach
- Empirical research on memetic processes
- Models of meme generation and meme spread
- Fundamental approaches aimed at structuring the field of memetics as a

Editorial Board:
Gary Boyd, Bruce Edmonds (also publisher), Liane Gabora, Francis Heylighen,
Mark Mills, Hans-Cees Speel (managing editor), and Mario Vaneechoutte.

Advisory Board:
Gary Cziko, Richard Dawkins, Daniel Dennett, David Hull, and Aaron Lynch.

The JOM-EMIT site contains two mailing lists: an announcement list for the
distribution of abstracts of new papers to all subscribers, and a
discussion list for open debate on the submitted papers, and other memetics
related issues. You can subscribe to these lists via the Journal's home


The first issue (May-September) of JOM-EMIT is now online.
It contains four papers and one book review:

"The Six Essentials? Minimal Requirements for the Darwinian
Bootstrapping of Quality" by William H. Calvin

"Cultural R/K Selection" by Agner Fog

"The Origin and Evolution of Culture and Creativity" by Liane Gabora

"Macromemetics and the Re-unification of Philosophy" by Derek Gatherer

Book review:
"Evolutionary Paradigms in the Social Sciences. Special Issue,
International Studies Quarterly 40, (September, 1996)" by Francis A.

New papers will appear continuously when they are accepted for publication.


We invite interested authors to submit papers for the journal, along with
information on the author, his/her address and affiliations, at the
journal's email address:


All submitted papers will be anonymously reviewed. They will only be
accepted for publication if approved by at least two referees.

Controversial papers of high quality can be accompanied by commentaries
after they are published (similar to the commentary structure the "Brain
and Behavioral Sciences" uses) . This means that a number of peers write a
short comment, to which the author can reply in a concluding short paper.

We also wish to invite letters which criticize published papers, and book
reviews relevant to the journals' aims.

Further information for authors can be found at the web address:

Theories come and go, the frog stays [F. Jacob]
Hans-Cees Speel
Managing Editor "Journal of Memetics Evolutionary Models of
Information Transmission"
submit papers to JOM-EMIT@sepa.tudelft.nl

I work at:
|School of Systems Engineering, Policy Analysis and management
|Technical Univ. Delft, Jaffalaan 5 2600 GA Delft PO Box 5015 The Netherlands
E-mail hanss@sepa.tudelft.nl

--end forwarded message----------------


<45:10>From snoe@ivy.tec.in.us Fri May 23 09:35:04 1997

Date: Fri, 23 May 1997 09:40:18 +0500
To: darwin-l@raven.cc.ukans.edu
From: snoe@ivy.tec.in.us (Stephen Noe)
Subject: Re: You shall know them

Some decades ago, as undergraduates, a friend and I took a philosophy
course, and Vercours' book was discussed.  Unfortunately, we read only
excerpts, because the complete book was unavailable.  One issue we brought
up, but could not settle, was whether the plot included sexual intercourse,
or artificial insemination.  We discussed the biological species concept, as
opposed to the then-new numerical phenetics.  As I recall, the professor was
rather impressed that scientists were more than just butterfly collectors or
engineers who built particle accelerators instead of railroad bridges.

Also, I believe Baya Singer ( Sanger ?? I hit the delete a bit too quickly
:^(  ) pointed out the many discussions in SF on this topic.  One of the
best is a short story by the late Robert Heinlein, called "Jerry Was A Man."
We are not passengers on Spaceship Earth, we are crew, and it's about time
we took our duties seriously.
J. Stephen Noe, Human Anatomy and Physiology
Ivy Tech State College
1 West 26th St.
Indianapolis, IN 46206

Darwin-L Message Log 45: 1-10 -- May 1997                                   End

© RJO 1995–2021