Notes on Artificial Life & Cellular Automata

Preface

Below follows a diary/journal that I compiled between 1991 and 1993 that details some experiments I ran with regard to systems of interacting cellular automata subject to evolutionary forces. The goal was to throw into one big melting pot many of the hip and cool terms -- prisoner's dilemma, Lindenmayer systems, grammars, cellular automata, genetic algorithms, evolution, natural and artificial selection, artificial life -- and see what came out. I believe that I was successful in this -- it's one hell of a mash.

In the process, I learned a number of things. Many of these are old truths. Some might have deserved a journal article. Even more could make for good hypotheses. For example:
Could networks of cellular automata train faster than neural networks?: Networks of cellular automata can resemble neural nets in many ways, and, in an arguable sense, are supersets of neural nets. However, small, simple automata are inherently digital, and not analog, and just might be more computationally efficient. In particular, such a digital net could overcome many of the engineering difficulties of building a neural net in hardware, such as the requirement for large analog transistors, which are typically hundreds of times larger than their digital transistor counterparts.
What is the best way of training a cellular automata net?: I played with genetic algorithms ... basically by treating the list of state transition rules as "genetic material" and implementing both random mutations and gene crossings. However, the results were mixed: in one experiment, which used natural selection ("survival of the fittest") to select for classes of state transition rules, a very simple "weed" that hogged all of the resources was quickly found. Putting the same system under artificial selection (I killed genes that didn't produce things I liked) proved to converge very slowly (not at all?) to the desired result.

In retrospect, these results are not that surprising. I note that genetic algorithms have come under increasingly negative public scrutiny, and are proving not nearly as efficient as other optimization methods applied to NP-complete problems -- even simple hill climbing and simulated annealing.
What are the phase properties of cellular automata nets?: Putting together a cellular automata system gives one a lot of variables to play with: the number of internal states, the number of messages that automata can pass to each other, the number of initial state transition rules, the frequency of random mutations, the numbers and types of "rewards" and "poisons" that each automaton can pass to the other. It is intuitively obvious that large systems will be slow to converge, while small systems may not have enough diversity. How does the convergence depend on the setup?

In particular, I saw some hint that there might be a first-order phase transition (using terminology borrowed from thermodynamics) in the convergence rate, as the above-mentioned parameters were varied. I felt like I discovered a new material. Then I day-dreamed for a while, and realized that networks of automata could be built to simulate DLA, percolation, crack propagation, and all of those other hip phenomena from material science. Networks of automata can arguably be used to simulate any set of differential equations treated as difference equations. The problem, of course, is that such a thing is overwhelming in its possibilities. Rather, the question needs to be narrowed: describe the new class of materials that result when the "atoms" are state machines that can remember their own history, and the history of interactions with other "atoms", in a very strongly time-irreversible fashion.
I have learned that I need to understand simpler systems much better, and I need far better data analysis tools. These systems tended to generate megabytes of data, mostly time series, that were hard to interpret and to correlate. The tools I did develop -- autocorrelation, Fourier transforms, and the like -- seemed to be powerless in the face of this data, and yielded little insight. Basically, my "lab" was poor on instrumentation.

I would very much like to pick up this line of research again ... but my personal situation does not allow this; something that I am trying to remedy. It is clear that networks of cellular automata represent a new class of materials that might be put to use in yet unimagined ways. The most obvious use -- as a replacement for neural nets -- is enticing. Appropriately applied, cellular nets could also provide insights into both ontogeny and phylogeny. But as with any fundamental research, the real interest is in the exploration for its own sake.

Source code and data are available on request, but don't expect great stuff. Write me at linas@teleportal.com and/or linas@austin.ibm.com .

Linas Vepstas -- July 1996.

The Diary



                            Experimental Log
                            ----------------
                              Linas Vepstas
                              -------------


June 1991
---------
The notes below detail some observations made on some code written in
June and July of 1991.  This code represents my first computer experiment
with evolution, mutation and selection.  It was inspired by what I've
read about the prisoner's dilemma game in Scientific American some 5-10
years ago, and some recent research results on evolution reported at
SIGGRAPH (in particular, by the Polish fellow who showed how Lindenmayer
systems describe the growth, budding, flowering and going to seed of
plants).

(Background material: a Lindenmayer system is a grammar with a set of
production rules whereby the production rules are applied to each and
every token of a string simultaneously, giving a new string; and then
repeat. This is in contrast to the traditional Chomsky grammars used in
computer science lexical analyzers (e.g. lex & yacc), where one
token is selected, and the production rules are applied to that token,
giving a new string, and then repeat.  Chomsky grammars and Thue systems
are one and the same thing.

It is a well known result of mathematics that for each Chomsky grammar,
there exists an equivalent Turing machine.  One can argue that
Lindenmayer systems are to Chomsky grammars what cellular automata are
to Turing machines.  I believe that comparison could be made rigorous,
although I haven't ever seen such an attempt.)
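
To make the contrast between the two rewriting styles concrete, here is a
minimal Python sketch. The rule set (the textbook "algae" rules A -> AB,
B -> A) and the function names are illustrative assumptions; they are not
taken from the experiments described in this log.

```python
# Hypothetical example rules (the textbook "algae" L-system), for illustration only.
RULES = {"A": "AB", "B": "A"}

def lsystem_step(s, rules):
    """Parallel rewriting: every token of the string is replaced in one step."""
    return "".join(rules.get(tok, tok) for tok in s)

def sequential_step(s, rules, pos):
    """Sequential rewriting: only the token at index pos is replaced."""
    return s[:pos] + rules.get(s[pos], s[pos]) + s[pos + 1:]

s = "A"
for _ in range(4):
    s = lsystem_step(s, RULES)
print(s)  # ABAABABA
```

In the parallel version the whole string changes at once, much as every
cell of a cellular automaton updates in the same tick; the sequential
version touches one position per step, like the head of a Turing machine.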

My original attempt was to somehow generalize these results, although I
had some significant problems in grasping the whole thing.  One of the
problems had to do with the best way of representing the grammars/Turing
machines.  Eventually, I settled on a representation with Turing
machines. Here's what I did do:

I created a two-dimensional toroidal universe (actually, a twisted
toroid) populated by "entities".  Each entity was a Turing machine. Each
machine had an internal state, a location (i,j) in the world map, the
ability to sense its surroundings (nearest and next-nearest neighbors),
and a set of transition rules.  Based on its internal state and on the
state of the neighbors (if any), it would move in one of eight
directions (horiz, vert, & diagonal). Essentially, they would swim
about. The Turing machines were initialized to random states, and endowed
with a random set of transition rules.  
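
A rough Python sketch of the movement and sensing on such a wrap-around
map follows. It assumes a plain torus (the "twist" is not specified here,
so it is left out), and the names DIRECTIONS, move and neighbours are
made up for illustration.

```python
# The eight possible moves: horizontal, vertical and diagonal.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1),
              (0, -1),           (0, 1),
              (1, -1),  (1, 0),  (1, 1)]

def move(pos, direction, width, height):
    """Step one cell in the given direction, wrapping at the edges (plain torus)."""
    i, j = pos
    di, dj = DIRECTIONS[direction]
    return ((i + di) % width, (j + dj) % height)

def neighbours(pos, width, height, reach=1):
    """Cells within `reach` steps; reach=2 adds the next-nearest neighbours."""
    i, j = pos
    return [((i + di) % width, (j + dj) % height)
            for di in range(-reach, reach + 1)
            for dj in range(-reach, reach + 1)
            if (di, dj) != (0, 0)]
```

For example, move((0, 0), 0, 10, 10) wraps around to (9, 9).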

If, in swimming about, it encountered another entity, it would play a 
prisoner's dilemma game with it.  It did so by "beaconing": it would post
a value (0 or 1), as would the other entity.  Based on its beacon, its
neighbor's beacon and its internal state, it would transition to a new
state (and post a new beacon). This would continue for a pre-agreed number 
of cycles (the pre-agreed length was the minimum of the supported
lengths of each entity).  After the exchange, each entity would either
"cooperate" or "defect" -- i.e. work with or stiff the other entity. 
Each entity had a "net worth"; thus, the net worth of the two entities
would increase and/or decrease as a result of the encounter, with a
payoff matrix being a traditional prisoner's dilemma payoff matrix.
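
The bookkeeping at the end of an encounter can be sketched as below. The
payoff numbers are the common textbook values (T=5, R=3, P=1, S=0) and are
an assumption; this log does not record the values actually used, and the
original matrix may well have included negative entries.

```python
COOPERATE, DEFECT = 0, 1

# Assumed payoff values; the log does not record the numbers actually used.
PAYOFF = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT):    (0, 5),
    (DEFECT,    COOPERATE): (5, 0),
    (DEFECT,    DEFECT):    (1, 1),
}

def settle(worth_a, worth_b, move_a, move_b):
    """Return the two net worths after one encounter."""
    gain_a, gain_b = PAYOFF[(move_a, move_b)]
    return worth_a + gain_a, worth_b + gain_b

def exchange_length(supported_a, supported_b):
    """The pre-agreed number of beacon cycles: the shorter supported length."""
    return min(supported_a, supported_b)
```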

After some number of cycles, a round of "selection" would be had:
entities with low net worths would be exterminated, entities with high
net worth would be reproduced (copied verbatim).  I found that I had to
experiment considerably with the selection algorithm -- some of them
bombed immediately.

Three selection schemes were tried:
1) Percentile scheme:  Entities were sorted into percentiles. Those in
   the top percentile were reproduced, those in the bottom were "killed".
   One problem with this scheme was that the population was not clamped,
   but was free-running.  After a while, either there was a population
   explosion, or everyone died.
2) Hi/Lo scheme: This is an extreme version of the percentile scheme.
   The "richest" entity was found and reproduced, while the poorest was
   killed.  This solved the population problem, by keeping the
   population a constant.  A new problem manifested itself: The
   population quickly converged to a mono-species (essentially, the same
   individual was almost always the richest, and it was reproduced until it
   dominated the population). Boring.
3) Avg/Lo scheme:  Here, the poorest individual was exterminated, but
   the one to reproduce was chosen randomly out of the population.  Thus we
   nibbled away at the losers.
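
The two constant-population schemes can be sketched in a few lines of
Python. Entities are represented here as plain dicts and `worth` is a
caller-supplied function; both are assumptions made for illustration, as
is the choice to pick the random parent before the poorest is removed.

```python
import random

def hi_lo(population, worth):
    """Hi/Lo: copy the richest entity verbatim, remove the poorest."""
    richest = max(population, key=worth)
    poorest = min(population, key=worth)
    survivors = [e for e in population if e is not poorest]
    return survivors + [dict(richest)]

def avg_lo(population, worth, rng=random):
    """Avg/Lo: remove the poorest, copy a randomly chosen entity."""
    poorest = min(population, key=worth)
    parent = rng.choice(population)
    survivors = [e for e in population if e is not poorest]
    return survivors + [dict(parent)]
```

Both functions keep the population size fixed, which is what removes the
explosion/extinction problem of the percentile scheme.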

In reproducing, two types of reproduction were attempted: in one, the new
individual was dropped at a random location, and, in the other, the new
individual "budded" next to its "parent".

Yet another feature of this code was the addition of "mutation" -- at
some very low rate, rules, states, the number of states, the beacon
message length, the beacon state, and the distance sensitivity of the
entities were mutated randomly.  I figured this would help introduce
diversity into the culture.

Because of crashes/power failures/reboots, etc, the ability to
save/restore the population in ASCII format was implemented.  The
population was visualized using IrisGL; I could picture all sorts of 
statistics -- net worth, age, beacon state, etc.

I never did implement a way of tracking inheritance or of measuring
genetic diversity, although I had intended to.

Summary/Conclusions:
--------------------
I eventually abandoned this experiment.  It got boring.  In the version
where individuals reproduced at random locations, the population
eventually settled down to a collection of co-moving, non-interacting
entities.  The intermediate stages were quite interesting -- individuals
clustered together into groups, breaking apart and coalescing as other
entities bumped into the groups.

In the other version, where reproducing entities budded, the population
eventually settled down into two or three co-moving blobs.  After some
reflection, I realized how this was so, and realized that I could have
predicted this.  After all, what was it, really, that I was attempting
to accomplish? I don't know. Somehow, in the back of my mind, I was
hoping that the selective pressures on the population would cause the
entities to organize into interesting spatial structures, whatever these 
might be.  In fact, there were no selective pressures whatsoever to cause
this to occur.  

What I did find, though, in retrospect, is interesting if interpreted 
in a sociological light.  Individuals, when born apart, eventually
learned to stay apart.  It was much too dangerous to interact with
strangers, who could screw you over in the prisoner's game.  Note,
however, that something approaching half the population was in the form 
of doubletons or triplets, permanently glued together.  In each 
doubleton, the partners presumably were co-oping, raising their mutual 
net worths enough to stay in the game, although I did not probe to 
verify this. The problem with that population was that when one partner 
of the doubleton reproduced, the offspring sprouted in a random location,
and had almost no chance of finding a suitable partner.  Thus,
ultimately, it was culled.  (In fact, it is quite possible that the
population was "anchored" by the doubletons.  All of the singletons I
saw could easily have been the offspring of the doubletons, floating
about until their eventual death.  In fact, a co-oping doubleton would
always be stronger than a non-interacting singleton -- the net worth of
a co-oping doubleton would always increase, while that of the
non-interacting singleton would stay constant (since the lack of
interactions meant no opportunity to increase or decrease their wealth.)
They would eventually find themselves at the bottom of the wealth ladder,
and get exterminated.)

In the experiment where offspring budded, one found that from birth, it
was better to live among friends.  If, due to mutation or random
encounter, a rogue individual showed up, it was probably stiffed in the
prisoner's game until its net worth was exhausted.  The individuals in a
blob were presumably all of one genotype, and were busy rewarding each
other by co-oping in the game.  Again, I did not probe to find out if
this was really the method by which the blobs survived & grew.  I also
did not check to see if the individuals in a blob were of a single
species, or whether the blobs consisted of two or more symbiotic populations.


6 July 1991
-----------
Even with no evolution going on (since convicts game not yet
implemented), co-moving pairs are likely to come into being.


22 July 1991
------------
Have now implemented the convicts game, selection, mutations, and a file
dump and restore.

--) Of the various methods of selection, the one that seems to work best is
    where the "poorest" individuals are culled, and some other individual is
    reproduced at random.
--) culling the lowest and reproducing the highest didn't work;
    population settled down into a mono-species almost immediately.

--) Am experimenting with two methods of reproduction: A) reproduced
    individual is given random new location. B) reproduced individual is
    budded off of parent.  Selective pressures govern which direction the
    bud goes off in.  A "dormant" flag was added, also under selective
    pressure, to give individuals the ability to go dormant (not reproduce)
    so as to avoid "cancerous" growth.

Notes on files:

Culture 1 is a "bud" culture

1.42000 -- bud rule -- some proto formations
1.84000 -- settling down to two globs, a bit small for the actual number
           of individuals involved -- heavy overlap ?

Culture 2 is a "random location" culture

2.44000 -- some interesting pairs swimming around
2.66000 -- population is noticeably composed of non-moving individuals
2.88000 -- vast majority of population is non-moving (less than 20
           moving) (21 half-lives)
2.110000 -- all but 1 individual appears to be stationary (26
            half-lives)
2.132000 -- clusters of individuals (less than 20 total) swimming
            through ground of stationary singletons, pairs
2.154000 -- similar to above  (37 half-lives)
2.176000 -- ditto (42 half-lives)
2.198000 -- ditto (48 half-lives)
2.220000 -- (53 half-lives) no moving individuals.  Shall we continue,
            and wait for a hopeful mutation?



February, March 1992
--------------------
Ported code to IBM RT, was a real pain because RT doesn't have ANSI C
compiler.  Also, doesn't support IrisGL, did a trashy port of display to
X11 Windows.  Code moved to directory "life/prisoners".  New directory
created, "life/plants", where I hope to get some plant life going.


March 22, 1992
--------------
Above code moved to directory "life/prisoners".  New directory created,
"life/plants/light".  In this directory, first attempt made to evolve
plants.  The rules are: each Turing machine ("entity") must either touch
the ground, or touch another entity that is touching the ground. The
universe is illuminated with light; ray-tracing is used so that the
light can cast shadows. Entities are rewarded by light; after a
threshold, an entity can bud.  In the first experiment, only the tips of
the "plants" could bud; in the later version, entities in the stem could
bud, pushing up the stem.  I don't know why, but I was expecting to
actually see fractal-ish "plants" come into being.  (Incidentally, 
mutations were turned off, as was reproduction by other than budding 
(no "seeds").) Of course, no such thing happened.  The plants grew
predominantly in straight stalks.  There were some interesting
developments with regard to fractal-ish shapes, but these quickly found
themselves in the shade, and stopped growing.  It is impossible to beat
something that grows straight up, as far as gathering light is
concerned.  (The light was provided by three or four "suns", one directly
overhead, and others at an angle.)  (The save/restore of cultures was
not made operational for these guys.)
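
A much simplified Python sketch of the shadowing test is given below. It
assumes a grid world with horizontal wrap-around and a sun that is
infinitely far away, described by an integer step (dx, dy) pointing
toward the light. The function name and the grid representation are
assumptions; the original ray tracer is not described in enough detail
here to reproduce it.

```python
def is_lit(cell, occupied, width, height, sun=(0, 1)):
    """True if no occupied cell lies between `cell` and the sun.

    `sun` is an integer step (dx, dy) pointing toward the light, with dy > 0.
    """
    x, y = cell
    dx, dy = sun
    x, y = (x + dx) % width, y + dy
    while y < height:
        if (x, y) in occupied:
            return False
        x, y = (x + dx) % width, y + dy
    return True

# A three-cell stalk at x=5 and a single cell next to it at x=6.
occupied = {(5, 0), (5, 1), (5, 2), (6, 0)}
print(is_lit((5, 2), occupied, 20, 20))           # tip of the stalk
print(is_lit((6, 0), occupied, 20, 20, (-1, 1)))  # neighbour, slanted sun
```

With the overhead sun only the tip of the stalk is lit; with the slanted
sun the low neighbour falls into the stalk's shadow, which is the effect
that starves the slower-growing shapes.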


March 28, 1992
--------------
Hypothesis: I started thinking this way: maybe the plants need to be
rewarded by contact with "air", as well as "light".  This will give at
least the smaller, more complex shapes a fighting chance.  However, it
is not entirely obvious that this will promote fractal-ish shapes. (The
naive idea is that trees, bushes have very large surface area, with
rather small volume; i.e. that they are space filling curves.  Thus, the
"air" idea.)  However, I suspect that at least a third element is
necessary -- "water", which can only be obtained from the "ground" (in
infinite amounts), but must be passed up by each entity to the one
above.  A sufficient amount of light, air and water is necessary for a
given entity to bud.  I suspect that this is what leads to the shapes of
trees ... that a "trunk", main "bus" or "highway" is needed to connect
the ground to the air; the surface maximizing drive comes from the need
to couple to the "air" efficiently.  Further thought leads me to think
that the light is not really necessary -- air & water are sufficient to
bring out the desired behavior.  I will now proceed to test this
hypothesis.

(The original "plant" universe is a 600x600 toroidal universe, with 0
being the ground.  In the new tests, the height is not really needed, but
more width is.)


March 29, 1992
--------------
Well, I didn't really do the above.  I implemented water.  I ran water
for a while (having turned off light) and got a blobby mess.  I gave up,
and turned light back on. In retrospect, I think that this one is worth 
running again.  Water & light was a lot more interesting.  In the first
ten or twenty cycles, stalks do appear.  I suspected that these "stalks"
were deadwood, so I created a "winter" which comes every 27 cycles. In
the winter, I cull out all of the Turing machine transition rules that
were never used.  I also look at the water-net-worth of the individuals,
and get rid of those which have negative net worth (less than a certain
small negative minimum). (They get a negative net worth because I let
the entities "dry out" a little each cycle -- they lose one unit of
water each cycle). (The transition rules allow an entity to pass as much
as 100 units of water to a neighbor, depending on the internal state and
the direction the neighbor lies in). (Actually, the way I remove the
"dead" individuals is worth remarking -- effectively, they become
"transparent" -- clear to the light, but they do provide structural
support in that further budding by living cells does push on bud-branch 
and move it.  They do not occupy "space" -- see below).
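
The drying-out and the "winter" pass described above might look roughly
like this in Python. The dict layout, the per-rule use counter and the
value of MIN_WATER are assumptions for illustration (the log only says
"a certain small negative minimum"); the 27-cycle period is from the log.

```python
WINTER_PERIOD = 27   # cycles between winters, as stated in the log
MIN_WATER = -5       # assumed cutoff; the log only says "small negative minimum"

def dry_out(cells):
    """Every cycle, each living cell loses one unit of water."""
    for c in cells:
        if c["alive"]:
            c["water"] -= 1

def winter(cells):
    """Drop rules that were never used and mark parched cells as dead."""
    for c in cells:
        c["rules"] = [r for r in c["rules"] if r["uses"] > 0]
        if c["water"] < MIN_WATER:
            c["alive"] = False  # stays in place as support, transparent to light

def run(cells, n_cycles):
    """Advance the culture, holding a winter every WINTER_PERIOD cycles."""
    for cycle in range(1, n_cycles + 1):
        dry_out(cells)
        if cycle % WINTER_PERIOD == 0:
            winter(cells)
```

Dead cells are only flagged, not deleted, to mirror the remark that they
still provide structural support.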

The winter did knock out a lot of the population -- the stalks
disappeared, leaving cells hanging in the air.  Cut off from the water
supply, these eventually die. The cells stayed close to the ground,
looking garbagy.  One problem with this implementation was that I was
using "boson" rules -- multiple cells could occupy the same location in
space.  I didn't like this (the arcane technical reason was that only 
the most recent cell to occupy a square was "visible" to its neighbors
-- it's not really "bosonic", but just weird.), so I implemented
"fermionic" rules.  In moving a budding branch, I check to see if a
location was already occupied.  If so, then I search the neighboring
locations for a vacant slot, searching directions close to the bud
direction first.  If one is found, bingo, and I translate the rest of
the branch in the same direction.  If an empty nearest neighbor slot is
not found, the cell, and the remainder of the branch is killed.
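
The vacancy search in the "fermionic" rules can be sketched as follows.
The compass ordering, the function names and the use of a set of occupied
coordinates are assumptions; wrap-around and the ground are ignored to
keep the sketch short.

```python
# Eight directions in compass order, so adjacent indices are 45 degrees apart.
COMPASS = [(0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1), (-1, 0), (-1, 1)]

def search_order(bud_dir):
    """Direction indices sorted by angular distance from bud_dir."""
    order = [bud_dir]
    for k in range(1, 4):
        order += [(bud_dir + k) % 8, (bud_dir - k) % 8]
    order.append((bud_dir + 4) % 8)
    return order

def find_slot(pos, bud_dir, occupied):
    """Return (direction, cell) of the first vacant neighbour, or None.

    None corresponds to the case where the cell and the rest of its
    branch are killed.
    """
    x, y = pos
    for d in search_order(bud_dir):
        dx, dy = COMPASS[d]
        cell = (x + dx, y + dy)
        if cell not in occupied:
            return d, cell
    return None
```

The returned direction is the one in which the rest of the branch would
then be translated.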

I ran this overnight (about 2000 cycles, and 66 "winters"), and a thick 
mat developed.  At first, I felt disappointed again, but after some 
thought, I'm getting pretty excited.  First, some stalks, now, a moldy 
rhizome!  I like it! The trace of this run is in file 29.03.1992

I added some statistics tracking to the thing.  (I am stymied for lack
of decent graphing tools on UNIX). Apparently, one cell species overtook
the entire population (accounting for 3800 cells), with other cell species
having only token representation, accounting for several hundred.
This one cell species seems to have very clearly established itself after
about 100 cycles.  After about the same number of cycles, population
growth seems to have leveled out at about 3000 entities, growing only to
4000 after 2000 more cycles.  (It's hard to tell, without graphing facilities).

Incidentally, my "universe" is 1000 pixels wide -- this
means that the cells are 3-4 deep, per pixel, with the mat being 10-15
pixels thick.  Looking at these numbers, and the mat, it would seem that
the light has been pretty strongly attenuated on the ground. (Hmm --
this could be an interesting statistic to collect...)  It is interesting
to note that the average age of the individuals of the predominant cell
type is 25 cycles -- less than a "season"; and the std. deviation of age
is 35 -- looks like a good 2/3 die out every winter (is that
right???)  I'm now generating more stats, to see what this looks like.
(A species is identified through ancestry -- I started out with 100
individuals -- thus, 100 species, as I can trace back to the ancestral
parent.)

(Since I'm not doing controlled experiments, I've also added a
"mutation" -- every winter, every cell gets a new, randomly created 
transition rule.  If the rule got used by the next winter, that rule
becomes a permanent part of the entity's genetic complement.  The
average number of rules per cell does seem to grow slowly (up to 15
after 2000 cycles) --- interestingly, it's about the same for cells of
all ancestries, not just for the predominant "species". The number of
active rules seems to be about 17 -- seeing how I've started with 40,
and added 60 more, this states that 5 of 6 were never applicable to a
real situation.  However, this "inapplicability" statistic is probably
highly dependent on the internals... Have I explained the detailed
internals yet?) (Incidentally, the predominant species had 15 rules, a
bit on the low side overall.) The trace of these results is in file
30.03.1992

Details:
Each state transition rule must match the following, to apply to the
transition:
1) location of nearest neighbor (there are 8)
2) cell "type" of such neighbor (in the run described above, there are
   10 different cell types)
3) cell internal state (in the run above, there are 4 internal states).

After a transition, a cell's type and internal state change.
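
A stripped-down Python sketch of this matching step is shown here. The
field names and the first-match-wins lookup are assumptions, and the real
rules carried more than this (water transfer and budding, for instance);
only the three matching keys and the two outputs listed above are modeled.

```python
from collections import namedtuple

# A rule matches on (neighbour direction, neighbour type, own state)
# and yields a new (type, state). Field names are illustrative.
Rule = namedtuple("Rule", "direction neighbour_type state new_type new_state")

def apply_rules(rules, direction, neighbour_type, state, default):
    """Return (new_type, new_state) from the first matching rule.

    Falls back to `default`, a (type, state) pair, when nothing matches.
    """
    for r in rules:
        if (r.direction, r.neighbour_type, r.state) == (direction, neighbour_type, state):
            return r.new_type, r.new_state
    return default

N_DIRECTIONS, N_TYPES, N_STATES = 8, 10, 4
print(N_DIRECTIONS * N_TYPES * N_STATES)  # 320 distinct situations in that run
```

With 8 directions, 10 types and 4 states there are 320 situations a rule
could match, which puts the roughly 17 active rules per cell in context.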

In the next run, I've increased the number of internal states to 14.  I
note that the rhizome structure appeared much more quickly, although
after 150 cycles, there is still no one predominant species.
More details when the run completes.

I am not yet collecting statistics on the "average length of an orbit"
-- the number of rules that get actively used.  There are 17 rules on
average, but I can't tell how many of these are infrequently used.

"dormancy" code is in place, but currently disabled.


1 April 1992
------------
Hmm. In the internal states=14 experiment, the population eventually
settles down to one genus, which accounts for 93% of the population.
The average age in this population is 25 cycles (shorter than one season
of 27 cycles).  Average water content of each individual is 27.  They
won't live long on that.  Visually, the mat is very thin (syn: narrow, 
low) & dense; about 1/5th the thickness of the case above.  The 
individuals of the other species have been reduced to 1-3 of each.  
The number of applicable rules is down -- the population as a whole 
runs 4-10 rules, with the predominant species clocking in at 10 (very 
definitely high).  The predominant species took much longer to assert 
itself, not emerging as the clear winner until 500 cycles have passed. 
(Again, I wish I had a graph). The trace is in file 31.03.1992

Driving around today, it occurred to me that the system is probably in a
chaotic regime, and very far from the "self-organized criticality" that
I was hoping to find.  I am not clear on how to demonstrate this,
however.  But visually, I seemed to see the buds growing so fast that
they were crashing into one another.  This probably set each individual
into a chaotic walk through its set of states rather than having it
cling to a small set of states.  But how to show this ???.

In the next experiment, we limit number of internal states to 4 and the
number of externally displayed types to 4. Thus, a total of 4*4*8 = 128
different situations are possible.  Each entity starts out with 256
rules, which are neither created nor destroyed.  Given this, the chances
are (127/128)**150 = 31 percent that no rule (except the last, default,
rule) is applicable to the situation. I fear memory usage if I make this
any larger. It occurs to me that in the previous experiments, behavior
was dominated by the default rule ...

Ran out of real memory.  Am trying again with 75 max rules, so
probability is (127/128)**75 = 55 percent that default will apply.  What
I saw visually was a lot more interesting than anything that had come
before. More later ...
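
The two estimates above are easy to re-check. The sketch below assumes
that each rule is drawn independently and uniformly from the 128 possible
situations; the function name is made up. It also evaluates the formula
for 256 rules, since the text mentions that count while the exponent
used is 150.

```python
def p_default(n_rules, n_situations=4 * 4 * 8):
    """Chance that none of n_rules random rules matches a given situation."""
    return (1 - 1 / n_situations) ** n_rules

for n in (150, 75, 256):
    print(n, "rules:", format(p_default(n), ".3f"))
```

This gives about 0.308 for 150 rules and 0.555 for 75 rules, in line
with the 31 and 55 percent quoted, and about 0.134 for 256 rules.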

Still ran out of memory.  Will try again on an RS/6000.

Preliminary evidence indicates that most rules never get used, and that
others get used very, very often. This would indicate that the system is
very far from a chaotic point.

The trace of these partial results is in file 03.04.1992


4 April 1992
------------
Partly due to the memory usage problems of the experiment above, and
partly due to the realization that earlier experiments may have been
driven by the default rule, I re-implemented the test so that each
individual gets a choice from a large number of rules (500) for each
state transition. To avoid memory over-commitment, the unused rules are
re-initialized and passed on to the next entity for its state
transition.  Doing this slowed execution time way down, but essentially
gives each individual 500 rules to choose from.  What is interesting is
that after 180 generations, no individual had accumulated more than 15
rules, with the average appearing to be about 10 (I will implement 
something to track this stat.)  Undoubtedly, this number depends on the
total number of states (4 in this sim), and the total number of types 
(also 4 in this run).  This would appear to be substantial evidence that
the system is not chaotic -- the number of applicable rules does not
grow without bound, but seems to settle down to a surprisingly low
number. (There are 4*4*8 = 128 different configurations that an entity 
can find itself in -- thus, one might have expected 128 rules to be in
play, if the system really was chaotic.  Even given that typically,
a given entity had only a neighbor above and below, one might expect
4*4*2 = 32 rules to be at play.) (I fear this needs more careful
analysis. The indication is that a given entity does not walk through
all of its states, but only through a limited number of them.)
(The trace of this experiment is in file 04.04.1992)

I realize why I never saw branching structures; the way budding worked
was that it was always one dimensional -- the bud always pushed the
stalk up.  I'll have to implement branching buds to get branching.

I am disturbed that I have to implement a new feature to see new
behavior.  My original goal was to have this process be a lot more
automatic ... that somehow everything would happen, without the
necessity of my intervention.  Somehow, my overall model seems to be not
general, generic enough ... but I can't figure out how to make it so
...


5 April 1992
------------
OK. Ran experiment, as on day before, with ntypes=8, nstates=8.  The
culture has been the "best" to date.  I know I'm not being driven by the
default rule. Two tall (over 40 pixels high) blades emerged; one simple,
straight, the other with knobs on it.  Both survived essentially intact
for 800 cycles. A carpet of mossy things covered the rest of the floor.
Lots of very interesting small scale structure, but nothing other than
the above mentioned two sticks.  After 800 cycles, a good 30-40 genotypes
survived in respectable quantities; no one type seemed apt to take over.
In fact, the system looked to be pretty stagnant.  I'll betcha the stats
(in file 05.04.1992) will confirm this. What else? the predominant
genotypes had 10 to 30 active rules associated with them, nobody
cleared 30, in fact.


6 April 1992
------------
Next experiment underway.  This one fixes the branching problem.  It
also adds a new interesting twist: branches that get detached will fall
to the ground (or at least to the lowest unobstructed point).


17 April 1992
-------------
Fixed a stupid bug in the relocation code. The results look very 
interesting.  Saw a number of complex, promising, plant-like structures
emerge.  Unfortunately, learned something about the 2D world: grasses, 
which can grow tall, can block all light.  Thus, the more interesting 
structures, which don't grow as fast, eventually become light starved 
and stop growing.  The need to move to three dimensions seems inevitable.
Unfortunately, I seem to be staring at two problems: one is a lot of code
modification to 3-d'ize the model, and the other is that 12 megs simply
ain't enough to do a good job.  In a previous run, the machine asserted
with flashing c6-07 (which I think is a lack of page space ??).

The other problem I have is that I need Motif and IrisGL, which are both 
missing on the RT.


27 June 1992
------------
Begin active development of a 3D version. I have a number of changes
(improvements) I want to make.  One is a structural change that will save
lots of space, and is closer to the initial intent I was striving for a
year ago.  It's so obvious I could shoot myself.  Going to have the
abstraction of a "gene": this is the structure off of which all of the
state transition rules will be anchored.  The gene itself will have no
position, no neighbors.  All entities (cells) of a plant will share this
one gene and use it to perform their state transitions.  (In the prior
implementation, each entity had its own set of rules.  Very wasteful of
space, and of marginal/detrimental value to the growth/variation.)
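
In object terms the restructuring amounts to something like the Python
sketch below. The class and attribute names are assumptions; the point is
only that a bud receives a reference to its parent's gene rather than a
private copy of the rules.

```python
class Gene:
    """Holds the transition rules; has no position and no neighbours."""
    def __init__(self, rules):
        self.rules = rules
        self.viability = 0.0

class Cell:
    """A positioned entity that borrows its rules from a shared Gene."""
    def __init__(self, gene, pos, state=0):
        self.gene = gene
        self.pos = pos
        self.state = state

    def bud(self, new_pos):
        # The bud shares the parent's gene object; nothing is copied.
        return Cell(self.gene, new_pos)

gene = Gene(rules=["rule-0", "rule-1"])
root = Cell(gene, (0, 0))
tip = root.bud((0, 1))
print(tip.gene is root.gene)  # True
```

Memory use then grows with the number of plants rather than the number
of cells, and any change to a gene is seen by the whole plant at once.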

This restructuring also makes very clear how to implement gene crossing.
In the past organization, it didn't seem to make much sense to cross
genes.  After all, this would affect only one entity, and by then, it
was already rooted and growing; the change would not benefit the entire
plant.  Now, I'm thinking of letting the system grow for some number
of cycles, and then measuring the viability of each plant. The viability
is associated with the gene.  The plants are then "killed", and the
genes are crossed and used to start the next generation.  Why haven't I
thought of this before ???
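
One possible shape for this grow / measure / cross cycle is sketched below
in Python. A gene is represented simply as a list of rules, one-point
crossover is an assumed operator, and the viability measure is left as a
caller-supplied function because no particular measure is fixed here. It
needs at least two genes to work.

```python
import random

def cross(rules_a, rules_b, rng=random):
    """One-point crossover of two rule lists (an assumed operator)."""
    cut = rng.randint(0, min(len(rules_a), len(rules_b)))
    return rules_a[:cut] + rules_b[cut:]

def next_generation(genes, viability, n_offspring, rng=random):
    """Cross pairs drawn from the more viable half of the gene pool."""
    ranked = sorted(genes, key=viability, reverse=True)
    parents = ranked[:max(2, len(ranked) // 2)]
    offspring = []
    for _ in range(n_offspring):
        a, b = rng.sample(parents, 2)
        offspring.append(cross(a, b, rng))
    return offspring
```

Passing a seeded random.Random instance as `rng` makes a run repeatable.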

I've also upgraded the RT to 16 MB of memory.  This should help.  I've 
also obtained and installed the Motif widget library; got what I think
is an X11.4 library; the server remains X11.3.  I do have some other,
funky servers, but these hang up in weird ways under weird situations.

On further consideration, I think I will try this gene stuff on the 2D
version first.

Development remarks:
--------------------
The initial state of a new entity will be the first rule of a gene.


28 June 1992
------------
Hmmmmm. I am faced with a bit of a conundrum.  Thinking about
implementing sex for these plants has forced me to address certain issues
that I have been avoiding.  In the current implementation, the
individuals all propagate by dropping buds.  After a large number of
cycles, some of the plants have grown tall, and have blocked all of the
light from the others.  End of story.  They've won.  How do I introduce
the concept of sex/survival of the fittest into this scenario?   Well, I
introduce the concept of winter, mow all the plants down, and compare
the genes for their relative strength. The weaker genes I cull, the
stronger ones, these I allow to have more instances thereof.  But here
is the magic question: which of these genes are stronger? 

The ones that grew taller? The ones that accumulated more water? The 
ones that have accumulated more light?  The ones that have more buds?
The ones that have had a lot of buds have already done quite well for
themselves.  Why should I further reinforce this behavior?  What
behavior should I reinforce?  Why bother?  I can't figure it out.  It
is much too tempting to end this with the conundrum that biologists have
been facing for a while: Why sex? Why bother? I mean, what's the point?
(Why sex? is not really the issue, but it does bring into question the
whole point of this exercise.)

I'm getting bored of this whole thing.  I am thinking of adding a couple
of features that would allow me to be gardener, and call it a day.  For
one feature, I will wait for a mouse click.  The plant on which I've
clicked will die. This way, I can selectively choose for plants that I
find visually interesting.  It'll be a good toy; but I'm not sure what
I've learned from this experience.


4 July 1992
-----------
Hmmm. Some more results to report.  I've been running a plant simulation
in the background, while working on other things.  Every now and then,
like a good gardener, I'd do some weeding.  After three days of running
on the RT, I am finally getting some interesting structures.  By hand,
essentially, I am applying selective pressures, where I am selecting for
interesting looking, branchy structures. It's been a long and tedious
process.  I am about to start on something completely different (I want
to apply these population principles to growing electronic circuitry),
but before I do so, I have one more experiment.  I am going to implement
artificial selection.

Suddenly, the experience of doing some "gardening" has clarified in my
mind the difference between natural and artificial selection.  What I've
implemented with the plants is a system that undergoes "natural"
selection.  Natural selection occurs when the punishments and rewards are
integrated inherently into the structure -- e.g. "water net worth" and
"light net worth".  Plants which get light and water grow, and those
which don't, don't. I do not have to apply any additional selective 
pressures -- water and light do it for me, automatically.  

However, what I can do, and will, is to apply artificial selective 
pressures.  Every so often, I am going to look at all of the plants, and
examine them for branching structure.  Those that branch well, those I
will reward by reproducing, and those that fail to attain any height, or
to branch, I will eliminate.  We shall see how this experiment runs --
we'll see what applying pressure to the population will do.  Anyway, to
finish defining artificial selection: it is a force coming from outside
of the system, not inherently part of the system.
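
A sketch of what such an outside judge might look like.  Everything
here is an assumption made for illustration: a plant is represented as
a {cell: parent_cell} map, and "branches well" is scored as the number
of cells with two or more children, with height as the tie-breaker.

```python
from collections import defaultdict


def branching_score(parents):
    """Score a plant given as {cell: parent_cell}, root mapped to None.
    Returns (height, number of branch points)."""
    children = defaultdict(list)
    root = None
    for cell, parent in parents.items():
        if parent is None:
            root = cell
        else:
            children[parent].append(cell)
    height, branch_points = 0, 0
    stack = [(root, 0)]
    while stack:
        cell, depth = stack.pop()
        height = max(height, depth)
        if len(children[cell]) >= 2:
            branch_points += 1
        stack.extend((child, depth + 1) for child in children[cell])
    return height, branch_points


def select(plants, keep):
    """Artificial selection: keep the `keep` best plants, ranked by
    branch points first and height second."""
    ranked = sorted(plants,
                    key=lambda name: branching_score(plants[name])[::-1],
                    reverse=True)
    return ranked[:keep]


plants = {
    "stick": {0: None, 1: 0, 2: 1, 3: 2},
    "bush": {0: None, 1: 0, 2: 0, 3: 1, 4: 1},
    "stump": {0: None},
}
winners = select(plants, keep=2)
```

Note that the score is computed entirely outside the water and light
bookkeeping, which is what makes the selection "artificial".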


8 July 1992
-----------
Ran the artificial selection program for 1000 cycles.  With initial
population of 65, and a total of 85 gene crossings to date, only 13 of
the 85 "took" (weren't deemed least fit during the next competition
round; one competition round every 10 cycles).  This is a success rate
of roughly 1/6 -- pretty poor.  Why? 1/2 of all genes will attempt to grow
downwards (bad), and 1/2 will start dormant -- thus, one would expect a
success rate of 1/4.  Obviously convergence will be slow.
A trace of this run can be found in file 07.06.1992.
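
The arithmetic, spelled out as a quick check.  The independence of the
two 1/2 factors is an assumption implicit in the estimate above.

```python
from fractions import Fraction

took, crossings = 13, 85
observed = Fraction(took, crossings)

# Assumed independent: half of the genes grow downwards, half start dormant.
p_grows_up = Fraction(1, 2)
p_not_dormant = Fraction(1, 2)
expected = p_grows_up * p_not_dormant

print(float(observed))  # about 0.153, between 1/7 and 1/6
print(float(expected))  # 0.25
```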

One interesting characteristic is that the ground fills up with a
clutter of tangled trash after a short number of cycles.  New plants
have to find their way out of this clutter -- essentially, by
establishing a symbiotic relationship with the clutter on the ground --
growing roots, as it were, since there are darned few plants that
actually have the opportunity to reach the ground.  Interesting. 

Made some changes to the way rules are used which I expect will improve
overall behavior.  In the prior implementation, had to match the
neighbor type AND the neighbor location to get a match (to determine the
state machine's next state).  Added ability to "not care" about the 
location of the neighbor, as long as the right type IS a neighbor; 
also added "don't care if there is a neighbor" type rules. This should
improve the viability of the genes all-around.  It might also avoid some
of the matting on the ground.
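
The three kinds of rule can be sketched as follows.  The encoding is
hypothetical (None as the "don't care" marker, locations as strings,
types as integers); only the matching logic follows the description
above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ANY = None  # assumed wildcard marker for "don't care"


@dataclass
class Rule:
    location: Optional[str]       # e.g. "up"; ANY means any location
    neighbor_type: Optional[int]  # ANY means no neighbor is required at all
    next_state: int


def matches(rule: Rule, neighbors: Dict[str, int]) -> bool:
    """`neighbors` maps each occupied location to the type of cell there."""
    if rule.neighbor_type is ANY:
        return True  # "don't care if there is a neighbor"
    if rule.location is ANY:
        # Right type must be a neighbor, wherever it sits.
        return rule.neighbor_type in neighbors.values()
    # Original behavior: type AND location must both match.
    return neighbors.get(rule.location) == rule.neighbor_type


def next_state(rules: List[Rule], neighbors: Dict[str, int], current: int) -> int:
    # First matching rule wins; with no match the state is unchanged.
    for rule in rules:
        if matches(rule, neighbors):
            return rule.next_state
    return current


rules = [Rule("up", 2, 10), Rule(ANY, 3, 20), Rule(ANY, ANY, 30)]
```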

(ALSO -- in the past several weeks, added ability to monitor statistics
dynamically, added ability to view individual plants & their stats,
first by going into "view" mode by hitting middle mouse button, and then
clicking on the plant with the right mouse button).


12 July 1992
------------
Recompiled on RS/6000, ran for 12 hrs, (6000 clock ticks). About 1/4 of
the 65 initial genes succumbed in the first 35 competitions (one
competition every 10 clock ticks), but then after that, no new gene ever
managed to establish itself. Of the 10-15 that were created as a result
of crossing, none appeared to be particularly strong. What, precisely,
is going on, that the system fails to generate genes that are stronger
than the initial, randomly chosen, complement?  Is there a bug?  (I did
see some peculiar behavior when I used the "kill" button, but this could
be explained in other ways ...)

Late June, I got the idea of using these algorithms to "grow" electronic
circuits and corresponding microcode.  This strikes me as a very
appealing idea -- the creation of electronic designs by minimizing the
cost in terms of transistors, wire lengths and cycle times.  Could be
lucrative, if it could be made to work.  However, I've been putting this
off, pending some greater successes with the plants.

Back in February 1992, I had been reading a book on Neural Nets, and
came away with a bad taste in my mouth.  Neural nets did not appear to
be particularly tractable things for solving problems.  One thing led
to another, and I again got to thinking about the automatons I had
created back in 1991.  I had the idea that I was going to create another
set of automatons that would scan English language texts and come to
"understand" them.  There was one stumbling block that I couldn't quite
overcome.  What would be the cost function?  How would I recognize one
automaton as being better than another? It seemed that I needed a global
cost function that could evaluate the viability of a given automaton,
but I seemed to see no way of generating one automatically.  I could, of
course, manually examine the output of each, and pass judgment on it.
But this would be absurd; I don't have the time to do this, and it would
probably require a millennium if done by a human.  It seemed that there
was no automatic method for recognizing intelligence -- this job
required a human.  A bit discouraged, I figured that if I was so smart,
implementing a toy problem -- something like plant evolution, should be
easy.  After all, I seemed to clearly grasp all of the necessary
ingredients.

(There was one scheme that I thought might work, but Jeff Wilkinson
talked me out of it.  The idea was that if the automaton, after
examining one section of text, could then reproduce another section, it
would be rewarded.  Jeff noted that this is much like training a neural
net to one set of stimuli; it would do little more than "memorize" the
text; when faced with another text, it would likely generate word salad.
Jeff is likely right ... there is no magic, and my approach was probably
too simple-minded.)

Anyway, I've been reading Julian Jaynes' book, "The Origin of
Consciousness in the Breakdown of the Bicameral Mind".  Very interesting
book.  It's gotten me to reexamine an earlier hypothesis I've put forward.

"In the beginning, there was the Word:" in 1988, it occurred to me that 
we make a grave error in thinking of ourselves as conscious beings.  The 
claim was that in fact, we, as humans, are little more than petri dishes
supporting a culture, that living culture being "ideas".  Ideas "infest" 
our minds; it is ideas that are the true living "beings".  Ideas share a
property with viruses in that they have a dormant, inactive state: ideas 
can be captured in books, writings.  Ideas can survive for centuries,
millennia, ideas infecting open minds, propagating from generation to
generation of humans, either growing stronger & more detailed, or being
proved irrelevant in a new age, inapplicable, false, outmoded.  Someday,
better thinking machines will be built, and there ideas will flourish.
They'll go out to travel the galaxy, and then the universe; not us
humans, as so naively envisioned in Star Trek.  My mere writing this 
down, then, signals the birth, the first self awareness of an idea as 
a living thing.  

Ideas are very much part of the physical world, even though physics 
seems to have no place for them.  Ideas have "power" and "volition", as
it were: embodied in a human, they build bridges, they build empires,
they explore the universe and themselves, and they clash like titans.
Even as we break down the secrets of life, and incorporate them into our
consciousness, we have not even begun to understand what it means to
"understand", and what understanding is.  Excited as I was by this idea, 
there seemed to be nothing that I could do with it.

My mind then turned to ask the question, what is an idea?  What are the
forms that it can take? In particular, what are the written forms?  What
sort of text can I generate such that it would be infectious?
(Obviously, the written English word will do.)  But what sort of text
would prove infectious to an alien intelligence? What is the vocabulary?
how does one generate a vocabulary?  How can an idea be written, so that, 
ab initio, it carries its own vocabulary with it?  I gave up, stymied.

Julian Jaynes provides the solution: the construction of meaning through
metaphor.   He introduces the terms "metapherand": the thing being
talked about/explained, and "metaphier": the language used to explain
the thing being talked about.  Eventually, through usage and
familiarity, the metaphier comes to stand for the metapherand, and thus,
the vocabulary has become enlarged.  This doesn't seem to be such a
revelation now that I write it; nonetheless, I guess it identifies
the pieces out of which the construction is to be made.

Next, one realizes that ideas have two forms: the inactive, "inorganic"
written form, and the active, "organic" form, where they operate, grow
and mutate.  I always enjoyed the sociological idea of structuralism; my
main critique of it seemed to be that it was a static thing.  I would
prefer to think of a living idea as some structure, some set of links
and connections, some set of states and transition rules, in short, a
running Turing machine (much as nineteenth century thinkers compared the
mind to a steam locomotive).

Where are these rambling thoughts taking me?  In the notes from a
previous week, I drew the distinction between "natural" and "artificial"
selection.  In natural selection, I had a local reward function,
rewarding individual entities based on their local behavior.
In artificial selection, there is a global reward function, by which genes
are selected.  The problem I had with "growing" an automaton that
understood language was that I could not envision a global reward
function.  In fact, I'm now willing to assert that there is no such
function.  Rather, the correct way to grow an intelligent machine
is to provide a local reward function, and let survival of the fittest
take care of the rest.  The local function rewards the ability to
manipulate concepts, and to lash concepts together through metaphor.
Can I devise such a machine? Not yet; but I've made some forward
progress.


October 1, 1992
---------------
OK, back to plants.  This genetic algorithm stuff seems to have been a
failure.  Why did the last plant experiment fail to evolve superior
forms? Is it indicative of a failure of the genetic algorithm?  I read
that SciAm article on it and it sounded like voodoo written by a
crackpot.  The arguments seemed to be totally bullshit.  The failure of
my system to evolve seems to prop up this assertion.  However, it may
also be due to bad code.  I've got one more hypothesis, one more trick
to try.

What seemed to be happening on the screen was that I'd get a dense mat
of unstructured forms near the surface, and that's it.  New genetic
combinations seemed unable to punch through this mat.  Is it possible
that the offspring were just cut off from water and light?  I'm thinking
of modifying the water distribution algorithm.  Currently, I loop
through the entities.  Each entity identifies a neighbor, and then
passes such a neighbor some water.  It ignores other neighbors.  Perhaps
the new offspring were such ignored neighbors. What if I change the
algorithm so that an entity spews out water, and each neighbor partakes
equally of it.  Might this work better?
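
The two schemes side by side, as a sketch.  The assumptions: water
amounts are floats keyed by cell name, `give` is a fixed amount per
step, and "the neighbor identified by the entity" is stood in for by
the first neighbor in the list.

```python
def pass_water_uni(water, neighbors, give):
    """Current scheme: each entity passes `give` units to ONE neighbor."""
    new = dict(water)
    for cell, nbrs in neighbors.items():
        if nbrs and water[cell] >= give:
            new[cell] -= give
            new[nbrs[0]] += give  # every other neighbor is ignored
    return new


def pass_water_omni(water, neighbors, give):
    """Proposed scheme: the same amount is split equally among all neighbors."""
    new = dict(water)
    for cell, nbrs in neighbors.items():
        if nbrs and water[cell] >= give:
            new[cell] -= give
            for n in nbrs:
                new[n] += give / len(nbrs)
    return new


water = {"root": 10.0, "old": 0.0, "seedling": 0.0}
neighbors = {"root": ["old", "seedling"], "old": [], "seedling": []}

uni = pass_water_uni(water, neighbors, 2.0)
omni = pass_water_omni(water, neighbors, 2.0)
```

In the "uni" result the seedling stays dry; in the "omni" result it
gets half of what the root gives up.  Total water is the same in both.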

There are even more complex schemes I can imagine.  I could have rules
that determine how much water an entity wants to soak up out of the
environment.  I could have rules that depended on the current amount of
water & light net worth.  I could have two types of water, type A and 
type B, with some entities perceiving one as poisonous, and the other as
nourishing, and for other entities, v.v. A primitive system for
self-defense, as it were.  However, in the interest of fooling with only
one variable at a time, so that I have some hope of figuring out what's
going on, I elect to do the "omni-directional" water version first.  The
old code is in life/plants/genes/uni, with the omni code in
life/plants/genes/omni.


10 October 1992
---------------
Actually, before doing "omni", I'm trying a different variation.  One
thing I "rediscovered" about how I had coded things: every rule
identified a neighbor and a neighbor type.  A rule match occurs if there
really is a neighbor in the specified location, of the specified type.
It would then be this identified neighbor that is given water.

First, I'm making a change to the way water is given: it will be given
to the bud(s) -- the amount of water given is given by the rule; water
is given equally to each bud.  Since this watering mechanism is somewhat
out of kilter with respect to the neighbor selection thing, I think I'll
fall back, and simplify the rule selection scheme by knocking out the
selection criteria based on neighbor location and neighbor type.  We'll
see where this gets us.  I think I'll also knock out the "artificial
selection" as well, for the moment.


16 October 1992
---------------
Found and fixed the bug referred to at the top of the July 12th 1992 note
above.  The destroy_plant() was incorrectly skipping forward in the
entity chain, thus passing by certain entities that should have been
destroyed.  Those entities remained attached to their destroyed genes
--- thus when the genes got recycled, these entities used the new set of
genes.
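
For illustration, here is one common way a "skip forward" of this kind
arises: removing elements from a chain while walking it.  This is a
Python analog under assumptions (entities as dicts in a list), not the
original destroy_plant() code, whose details are not given here.

```python
def destroy_plant_buggy(entities, gene_id):
    """Removes while iterating: after each removal the walk steps past
    the entity that slid into the freed slot, so it is never examined."""
    for entity in entities:
        if entity["gene"] == gene_id:
            entities.remove(entity)
    return entities


def destroy_plant_fixed(entities, gene_id):
    """Rebuild the surviving chain in one pass, so nothing is skipped."""
    entities[:] = [e for e in entities if e["gene"] != gene_id]
    return entities


def make_chain():
    return [{"id": i, "gene": g} for i, g in enumerate([7, 7, 3, 7, 7])]


leftover = destroy_plant_buggy(make_chain(), 7)
clean = destroy_plant_fixed(make_chain(), 7)
```

In the buggy version, entities 1 and 4 survive while still pointing at
gene 7; if that gene slot is later recycled, they silently pick up the
new gene's rules.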

Preliminary runs seem to indicate that fixing the destroy function does
not qualitatively alter previous observations.

I'd still like to know why evolution seems to have come to a stop in
these experiments, and why new genotypes won't "take".  I'm not sure how
to go about this.

Decided that I need to keep population wide statistics, not
gene-centered statistics.  The gene centered statistics just generate
too much data which I am not able to analyze due to lack of tools.  The
following population statistics should be interesting:

average age of population,
average genotype age,
average entity age,
number of buds,
average length of bud,
average water content,
average light worth,
average length of rule chains,
average number of rules scanned before matching rule found,
percent of original population remaining after n generations,
average percent of time each entity exhibits a state.

Ideally, I should be developing a mathematical model of what I'm doing,
and comparing to the actual simulation outcomes.  Unfortunately, I'm a
tad too lazy to do this.  It seems awfully hard to do this, and hardly
seems worth the effort.

Note that some plants are anchored to the ground, and some are not.  Due
to the development of a thick mat, it is possible that the new
genotypes, which are dropped at the top of the mat, are simply starved
for water. (However, they can grow down, and ought to be able to send
down roots. However, they don't get light down there, so maybe they stop
growing down.  Maybe I need to alter the rules so that entities can
propagate light energy downwards. ?? !!!!)  Anyway, this suggests a
couple of new statistics:

percentage of plants not anchored on the ground.
percentage of failed genotypes that are not anchored on the ground.
ratio of failure rate of genes that start on ground vs. those that do not.


22 October 1992
---------------
Machine asserted with flashing c6-02.


24 October 1992
---------------
Complete revamp of statistics gathering.  All stats are now anchored as
part of gene.  Am keeping track of all sorts of interesting stuff.


26 October 1992
---------------
Thinking of applying for NSF grant.  Contacted John Werth (H) 346-2768
(W) 471-9583.  John says that grants originating in UT must first be
approved by UT's Office of Sponsored Research.

Be sure to tie into the following in grant:

(1) Random resistive & fuse networks.   Analytical techniques apply, esp. 
    use of renormalization group.  Commercial applications include percolation,
    (applied by oil industry), tearing & fracturing (applied in material
    sciences), electrical breakdown of insulators (applied in power
    distribution systems, and in semiconductor research).
    Get references.

(2) Random Boolean networks (nodes are boolean operators).  Such networks
    have been demonstrated to be capable of learning in a fashion similar 
    to neural nets.  Such networks apparently can also model gene
    expression in living organisms.  Get references from that SciAm
    article.

(3) Neural nets.  Nodes in a neural net (neurons) sum together inputs,
    and run result of such sum through an action potential.  Output is
    fed to other nodes in the net.  Neural nets have been shown to be
    capable of learning.  There is a nascent drive to commercialize
    neural net technology.

(4) Artificial Life.  Artificial entities are endowed with abilities to
    sense their surroundings.  Such entities are set loose in an
    artificial environment and are allowed to interact, breed, and evolve.
    Need references.

(5) In this proposal, the researcher wishes to study a construct that
    is a generalization of the concept of a neural net, a boolean
    net, and certain restricted forms of artificial life.  Nodes of 
    this network are represented by Turing machines.  Each node accepts 
    inputs from other nodes that are connected to it. Each node performs
    a calculation, generating an output, that is then transmitted to
    other nodes on the net.

    If the computation at each node of the net is made to be a sum of
    floating point inputs, followed by an action potential, then such 
    a net would be a neural net.

    If the computation at each node of the net is made to be a Boolean
    operation of binary inputs, then such a net would be a boolean net.

    If the network is not a static, fixed network, but one whose
    connections are dynamically made and broken, then the system
    resembles free entities that can interact with one another.

    The researcher believes that a network of Turing machines may be
    able to learn faster than ordinary neural nets, may use fewer
    computational cycles to arrive at an equivalent result, and may be
    easier to implement in hardware (since such Turing nets are
    inherently digital, and do not require large analog transistors 
    that ordinary neural nets would.)
   

26 October 1992, cont.
----------------------
OK. Back to plants.
Below follow a list of questions that need to be answered in order to
better understand the behavior of the genetic plant system that I am
toying with.

Observation:
I have observed that evolution in my system seems to slow to a crawl, 
and appears to slow further and further as time goes on.  Examining the
number of original genes that remain after a period of time, this
number at first drops rapidly, but then drops more and more slowly.  In
fact, it appears to drop slower than logarithmically (i.e. even after
an infinite time, some percentage of the original gene population will
remain.)  (This is for the system with artificial selection turned on.)
The "worth" measure seems to stagnate even sooner. (See data file
10.24.1992)  I abstract some of this below:

          % of orig                      average #       (delta)
cycles    genes still      worth         of rules        rule search
          around                         per gene        length
------    ----------       ------        --------        ------
0          1.00            -80.0           0.0             0.0
25         0.91            -63.7           25.0            12.3
50         0.80            -60.2           36.7            14.0
75         0.69            -57.7           42.3            16.3
100        0.65            -56.9           47.4            16.7
150        0.54            -53.4           43.6            18.3
200        0.52            -54.0           43.3            17.4
300        0.46            -54.5           40.7            20.3
400        0.42            -55.4           41.7            22.4
500        0.36            -56.2           38.6            24.0
750        0.34            -55.0           39.1            27.0
1000       0.31            -54.0           37.3            33.0

(In interpreting above, note that "winter cycles" occurred every 2
cycles.)

Fraction of original genes still surviving seems to fit empirical formula:
              fraction (t) = 3.0 / cube_root (t)
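
A quick check of this formula against the table, as a sketch.  The
(cycles, fraction) pairs are copied from the table above; nothing else
is assumed.  Note the formula exceeds 1.0 for t below 27, so it can
only be meaningful for the later rows.

```python
# (cycles, fraction of original genes) copied from the table above.
observed = [(25, 0.91), (50, 0.80), (75, 0.69), (100, 0.65), (150, 0.54),
            (200, 0.52), (300, 0.46), (400, 0.42), (500, 0.36), (750, 0.34),
            (1000, 0.31)]


def fit(t):
    """The empirical formula: 3.0 / cube_root(t)."""
    return 3.0 / t ** (1.0 / 3.0)


residuals = [(t, frac - fit(t)) for t, frac in observed]
worst = max(residuals, key=lambda r: abs(r[1]))

for t, r in residuals:
    print(f"t={t:5d}  observed - fit = {r:+.3f}")
```

The largest miss is at t = 25; from t = 50 onwards the formula stays
within 0.03 of the tabulated fraction.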

I am still stymied by lack of analysis tools.

The following hypotheses need to be resolved:

(H1) Is it possible that newly created genes simply do not survive
     because they are not created in ground contact, and essentially
     "dry up" before they can establish themselves?  I can partly test
     this hypothesis by examining the establishment rate as a function
     of endowed "water" which each "seed" is given to work with.

(H2) It would also be nice to have a measure of failed genes that do not
     get ground contact, but this is tough -- is failure immediate
     failure, or failure after some amount of time?  Certainly, "seeds"
     with a lot of water in them can survive a lot longer before they 
     perish by dehydration.  How long is long?  How about an "average
     life of failed gene"?

(H3) Ground contact should not be necessary if new seeds fall on a
     "fertile loam" of previously existing entities which pass them
     water.  Is the loam failing to pass water to new trees?  (This 
     should be easy and relatively unambiguous to get stats on).  

(H4) How should the "fertility" of the "loam" be talked about?  How does
     the depth of the loam vary as a function of how much water is
     passed up the line? Is the depth an (negative) exponential of the
     amount of water each entity passes up the line?  Has the "loam"
     evolved into an antagonistic thing?  Would it be possible to evolve
     a "friendly" loam if plants passed light energy downwards? (i.e. if
     roots hidden from sunlight had a mechanism for being fed.)

(H5) Is our system in a region of criticality, or is it completely
     chaotic, or totally ordered?  What is a good measure of criticality?
     I need a measure of criticality that runs from 0.0 to 1.0, with
     0.0 being total order, 1.0 being total disorder, and something in
     the middle representing a critical system.  

     I believe that this measure of criticality can be derived by looking 
     at how rules are used for state transitions.  If every rule is 
     consulted with equal frequency, then the system is arguably
     chaotic. If a small number of rules are visited repeatedly, then the
     system is totally ordered.  If some rules are visited most of the
     time, but others are visited occasionally, then the system is
     critical. (I believe that this ratio of "visited most often" to
     "visited less often" should be stable with time, right?)

     Can it be that the system appears to be critical for a while, and
     then settles into chaos? (I am afraid that this might be happening).

     (Watch out for duplicate rules -- the duplicates will be visited
     never.)
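
     One candidate for such a measure, offered only as an assumption
     and not something tried in these experiments, is the normalized
     Shannon entropy of the rule-visit counts.  It is 0.0 when a single
     rule takes every visit, 1.0 when all rules are visited equally
     often, and somewhere in between for a skewed mix.

```python
import math


def criticality(visit_counts):
    """Normalized Shannon entropy of rule-visit counts, in [0.0, 1.0].
    0.0: one rule takes all visits; 1.0: every rule visited equally."""
    n_rules = len(visit_counts)
    counts = [c for c in visit_counts if c > 0]
    total = sum(counts)
    if n_rules < 2 or total == 0:
        return 0.0
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return entropy / math.log(n_rules)


ordered = criticality([100, 0, 0, 0])
mixed = criticality([70, 20, 5, 5])
uniform = criticality([25, 25, 25, 25])
```

     Never-visited duplicate rules still count toward the number of
     rules here and so pull the value down; they would need to be
     removed from the list before calling the function.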


28 October 1992
---------------
Wow!  Ran the new code for 60 hrs.  The trace is captured in 10.26.1992. 
Some fascinating things happened.  First, the population stagnated.  Then
(for some reason I do not yet know) all of the original genes perished.
A handful (10) genes remained.  These all had one rule each, and did
nothing but compete against one-another in a meaningless way.

The interesting thing here is to understand how all of the genes ended
up with no rules.  Normally, new genes get an "infinite" (3000) number
of rules to work with.  Thus, as a gene aged, and encountered new
situations, it always had a bag of tricks to work from.  Each rule that
got used got marked as having been used.

When genes were crossed, unused rules would be identified and discarded.
Then the actual cross would occur.  Thus, the new gene had far less than
3000 rules associated with it.  Furthermore, each of these rules was
initially marked (by default) as having been unused.  It would only get
remarked if it got used (and many/most applied to only rare situations).
Thus, if this gene ever got crossed again, the new product would have
even fewer rules, and so on, until the population completely lost
all rules that described rare situations.

I fixed this by always appending a large (3000) list of new, unused,
random rules to every gene cross.  Furthermore, I marked each of the
inherited rules as having been used at least once (so that they will
never, ever be culled again, for the life of the experiment).
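
The ratchet can be reproduced in a toy model.  This sketch follows a
single lineage rather than a real two-parent cross, and the rule mix
(everyday rules that always fire, rare rules that fire 5 percent of the
time) and batch size of 40 are made-up illustration values, not the
3000 used in the program.

```python
import random
from dataclasses import dataclass


@dataclass
class Rule:
    use_probability: float  # chance per lifetime that the rule's situation arises
    used: bool = False


def fresh_rules(n, rng):
    # Random mix of everyday rules and rules for rare situations (assumed).
    return [Rule(rng.choice([1.0, 0.05])) for _ in range(n)]


def run_lineage(generations, fixed, rng, n_fresh=40):
    rules = fresh_rules(n_fresh, rng)
    sizes = []
    for _ in range(generations):
        for rule in rules:  # one lifetime of rule use
            if rng.random() < rule.use_probability:
                rule.used = True
        kept = [r for r in rules if r.used]  # cull unused rules at the cross
        sizes.append(len(kept))
        if fixed:
            # Inherited rules stay marked as used; a fresh batch is appended.
            rules = kept + fresh_rules(n_fresh, rng)
        else:
            # Original behavior: every inherited rule is reset to "unused".
            for r in kept:
                r.used = False
            rules = kept
    return sizes


buggy = run_lineage(30, fixed=False, rng=random.Random(1))
repaired = run_lineage(30, fixed=True, rng=random.Random(1))
```

With the reset, the number of inherited rules can only shrink from one
cross to the next; with the fix, it can only grow.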

This time, maybe evolution will stagnate, but at least it won't drive
the system into a trivial corner.

(BTW, I've changed things so that stats & selection take place AFTER
dead-branch-pruning, not before.)


1 November 1992
---------------

For comparison, let's look at the same program that generated the
10.24.1992 results, except that this time, we run a "select" cycle
(winter cycle) every 10 years instead of every 2 years.  Below follows an
abstract of the data found in 10.26.1992.

          % of orig                      average #       (delta)
cycles    genes still      worth         of rules        rule search
          around                         per gene        length
------    ----------       ------        --------        ------
0          1.00            -80.0           0.0              0.0
50         0.954           -66.2           41.7            30.1 
100        0.923           -64.0           54.6            32.5
150        0.862           -64.5           58.5            31.7
200        0.846           -63.5           64.0            33.3
300        0.754           -65.9           70.9            34.3
400        0.754           -63.5           74.8            40.0
500        0.708           -62.2           75.2            40.0
750        0.692           -62.5           78.3            42.0
1000       0.677           -63.6           80.3            42.6
2000       0.607           -65.5           81.1            45.7
3000       0.540           -65.5           79.9            45.5
4000       0.566           -65.8           82.3            48.8
5000       0.522           -64.4           76.7            51.9
7500       0.545           -64.3           85.4            55.7
10000      0.211           -50.4           26.6            48.8
15000      0.0             -55.0           1.0              1.0

(In interpreting above, note that "winter cycles" occurred every 10
cycles.)

Comparing this data to the "2-year" data, there is no obvious scaling.
Rescaling the time axis by 10/2 = 5 does not bring us into line with
earlier data.  The dwindling of the original population is slower, the
average "worth" is not as good, and the average number of rules
associated with each gene is about double.

As pointed out above, this program had a "fatal flaw" -- it reset gene
use counts to -1 (unused) after a gene crossing.  Thus, crossed genes
tended to lose important, but rarely applied rule segments; eventually,
the population drops off into a single-rule, random population.  Note
the precipitous drop in the average number of rules between the years
7500 and 15000.

In particular, note that, as before, the system, for a while, seems to
stagnate, as the %orig of population stops changing and the "worth"
stops going down.  Until the system falls off its gene-rejection edge, one
can argue that evolution has come to a grinding halt for the above
systems.  

I believe that we can conclude (based on the experiments still to be
analyzed below) that rarely-used rules are critical for the health,
vigor and adaptiveness of a gene.  That is, a gene must have in stock a
set of rules for dealing with "unusual" situations; and that the more
unusual situations a gene can counter, the more ecologically successful
it will be.

(H6) Provide a measure for the number of "unusual" situations a gene can
     deal with.  Keep track of that measure for the population.

--------

Correcting this mistake, we have several runs.  The first is 10.28.1992,
where the population really did seem to be "evolving":

          % of orig                      average #       (delta)
cycles    genes still      worth         of rules        rule search
          around                         per gene        length
------    ----------       ------        --------        ------
0          1.00            -81.0           0.0              0.0
20         0.985           -67.0          24.0            16.8
40         0.954           -72.0          40.0            31.3
60         0.923           -65.4          50.0            36.3
80         0.892           -64.5          58.0            39.7
100        0.877           -64.8          63.2            41.8
150        0.815           -62.9          73.5            45.6
200        0.754           -62.1          79.3            41.5
300        0.615           -59.1          85.3            44.1
400        0.554           -57.9          85.7            44.4
500        0.523           -56.7          85.5            42.7
750        0.446           -58.1          89.5            44.6
1000       0.354           -57.2          90.5            43.5
1500       0.246           -57.5          92.7            43.1
2000       0.154           -55.0          92.5            43.9
3000       0.108           -50.2          80.8            45.8
4000       0.062           -41.9          57.3            50.2
5000       0.046           -39.7          57.4            56.2

(10.29.1992 is substantially the same, from what I have of it).

The porig has a very nice curve when plotted on log paper.  
Between years    0 and  400, porig = 2 ** (-(2.2(+/-0.2)e-3) * t)
Between years  400 and 2000, porig = 2 ** (-(1.2(+/-0.1)e-3) * t)
Between years 2000 and 4000, porig = 2 ** (-(0.63(+/-0.1)e-3) * t)

Note that for small t, porig = e ** (-(t/10) * (1/(66(+/-6)))).  Note 
that 66(+/-6) is equal to the number of genes in the original gene pool.
That is, for small t, there is an almost one-hundred percent chance
that a new gene will replace one of the original genes.  How can we
understand this?  Suppose the original random gene pool was very large.
Then, chances are that any new gene, randomly created, is better than
SOME gene in the pool.  Thus, initially, we are constantly replacing an
existing gene with a new gene.  This continues on until about half the
genes have been replaced.  Now, any new gene, randomly created, has only
a 50-50 chance of being better than any already in the population.
Thus, we see the exponent halve.  This reasoning is almost sufficient to
allow us to predict the shape of the gene replacement curve (almost but
not quite).

porig = exp ( -(t/tw) * (1/N) * p(t/tw) )

t = time
tw = winter cycles (usually = 10)
N = gene population (usually = 65)
p(t) = probability that a randomly created gene is better than one that
       exists already.

We have already shown that 
p(0) ~= 1.0
p(1) ~= ((N-1)/N) + 1/(2*N)

To get further, one has to make assumptions.  If one assumes a 100
percent chance of replacing existing genes, but only 0.5 chance of
replacing new genes, then one gets
p(m) = p(m-1) * (N-1)/N + 1/(2*N)  // I don't think this recursion is right !!
     = 0.5 * (1 + ((N-1)/N)**m)

Even though I think I muffed the recursion, this formula seems good at
predicting porig through year 5000.  However, I don't think it's really
correct ...

Here is what the formula gives ---

year =    0   m = 0, prob of replcmnt = 1.000000, percnt orig = 1.000000 
year =  100  m = 10, prob of replcmnt = 0.928190, percnt orig = 0.866929 
year =  200  m = 20, prob of replcmnt = 0.866693, percnt orig = 0.765922 
year =  300  m = 30, prob of replcmnt = 0.814028, percnt orig = 0.686805 
year =  400  m = 40, prob of replcmnt = 0.768927, percnt orig = 0.623014 
year =  500  m = 50, prob of replcmnt = 0.730304, percnt orig = 0.570198 
year = 1000 m = 100, prob of replcmnt = 0.606080, percnt orig = 0.393596 
year = 1500 m = 150, prob of replcmnt = 0.548861, percnt orig = 0.281787 
year = 2000 m = 200, prob of replcmnt = 0.522506, percnt orig = 0.200346 
year = 3000 m = 300, prob of replcmnt = 0.504775, percnt orig = 0.097322 
year = 4000 m = 400, prob of replcmnt = 0.501013, percnt orig = 0.045814 
year = 5000 m = 500, prob of replcmnt = 0.500215, percnt orig = 0.021326 

Compared with the 10.28.1992 table above, it overestimates in the range
300-2000, comes closest around year 3000 (0.097 predicted vs. 0.108
observed), and then proceeds to underestimate the percentage of the
original population.
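
The numbers in the listing above can be regenerated with a few lines.
This is a minimal sketch in Python (not the original program), assuming
N = 65 and tw = 10 as listed, and taking the exponent to be negative so
that porig decays.  The function names are made up for illustration.

```python
import math

N = 65    # gene population, as stated above
TW = 10   # winter cycles per replacement opportunity, as stated above

def p_replace(m, n=N):
    """Closed form of p(m) = p(m-1)*(n-1)/n + 1/(2n), with p(0) = 1."""
    return 0.5 * (1.0 + ((n - 1) / n) ** m)

def porig(year, n=N, tw=TW):
    """Predicted fraction of original genes left: exp(-(t/tw)*(1/n)*p(t/tw))."""
    m = year / tw
    return math.exp(-m * p_replace(m, n) / n)

if __name__ == "__main__":
    for year in (0, 100, 200, 300, 400, 500, 1000, 1500, 2000, 3000, 4000, 5000):
        m = year // TW
        print(f"year = {year:4d}  m = {m:3d}  "
              f"p = {p_replace(m):.6f}  porig = {porig(year):.6f}")
```

Changing N or TW in the sketch shows how sensitive the predicted curve
is to the assumed pool size and replacement interval.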


2 November 1992
---------------
Bad news. The file 10.29.1992 is as long as it is because the system
core dumped.  I ran it again, and it core dumped in exactly the same
place.  Sounds like a memory leak; I hope it's not skewing all of my
results. Amazing it ran for 8550 cycles before croaking!

Anyway, need to have more stats. 

(H7) What is the distribution of worth for randomly created entities?  Is
     it a bell curve? If so, why?  What would the distribution be if,
     instead of killing the least fit gene, I randomly killed any gene?
     What is the distribution for the evolving system?  Can any of these
     distributions be predicted by analytical methods?

7 November 1992
---------------
Patty writes:
The plants: the MOVIE   I cannot wait. Let's sink our 
life- savings into this project and get GOOOOOD Help. 
The top director, Tom Cruise, Bruce Willis, Julia Roberts 
as.... THE PLANTS. Linas AND LINAS AS ....THE FUNGAS GOD!!!!!!!!!!!
LINAS FUNGAS, the Lithuanian-American Mad scientist and
evil creator of ....THE PLANTS....coming, to a theater 
near you. Probably too near.

26 November 1992
----------------
A whole bunch of stuff to report.  First, I tried solving, by analytical
means, a repeated integral of the error function, and bumped into a grim
hyper-geometric series. Here goes the logic for trying to predict the
gene replacement rate analytically:

1) The initial random set of genes has a distribution of worths.  Let's
   assume that the distribution is a Gaussian (see H7).  That is, the
   distribution D(x) = 1/(a*sqrt(PI)) exp (- (x/a) **2)

2) Consider the definite integral:
   I[1] = Int {from -inf to +inf} dx D(x)
   This represents the probability of generating a new random gene that
   has a higher net worth than an existing gene (of an infinitely large 
   gene pool). The value of this integral is one.

3) Define the indefinite integral:
   I[1](x) = Int {from +x to +inf} dy D(y)
   The probability of picking a second gene (randomly) that has a higher 
   worth than a first (randomly chosen) gene is then the definite
   integral:
   I[2] = Int {from -inf to +inf} dx D(x) I[1](x)

   Defining the indefinite integral:
   I[n](x) = Int {from +x to +inf} dy D(y) I[n-1](y)
   we see that I[n](-inf) == I[n] is the probability of picking n
   genes (randomly), each having a worth greater than the one before.

   For Gaussian D(x), we have:
   I[n](x) = (1/n!) * (1/2**n) * (erfc(x/a)) ** n
   and 
   I[n] == I[n](-inf) = 1/n!
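
The 1/n! result is easy to sanity-check by brute force.  The sketch
below (Python, illustrative names, not part of the original tools)
draws n independent Gaussian worths and counts how often each one beats
the one before it.  The trial count and seed are arbitrary choices.

```python
import math
import random

def prob_increasing(n, trials=100_000, seed=1):
    """Estimate the chance that n independent Gaussian worths come out
    strictly increasing in the order they were picked."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        picks = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if all(a < b for a, b in zip(picks, picks[1:])):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    for n in (1, 2, 3, 4):
        print(n, prob_increasing(n), 1.0 / math.factorial(n))
```

The estimate should land close to 1/n! for each n, up to sampling noise.
Nothing in the counting argument depends on the distribution being
Gaussian, only on the picks being independent and continuous.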

We do not have so much luck attempting to compute the expected value of
these successive picks.  The first two expected values are easily
solvable, but the third is not.  The best I could do after long attempts
was a nasty sum over hyper-geometric series.

The second thing to report is that I have let these things run for over
fifty thousand cycles.  One problem is that the plants eventually run up
all the way to the top of the world (this happens somewhere around 20K
years).  Second, after a while, genes start dying, and the gene
population starts to shrink (again, this starts happening in the
vicinity of 20K years). 

Finally, I decided to try to make a videotape of the plants
growing.  Towards this end, I have written a GL-based movie viewer that
can be found in genes/movie.  To generate the frames, compile with
NTSC_MOVIE defined.  Bitmaps will be dumped to files, in a 640x480
world.

27 November 1992
----------------
Got a graphing program, xmgr, off the net.  Took something like 15 hours
to compile on the RT!! It works; a bit buggy... a bunch of osf keysym
translation table entries are invalid ... also, judging from the layout
of some of the widgets, it looks like I have a buggy libXm.a as well.
Overall, xmgr is a bit bulky and slow on the RT.  Seems to have a lot of
interesting options, but I didn't get any documentation. Hard to drive
just by guessing.

It only accepts data in an XY format; had to write a filter for my file
format to generate the XY data that this thing wants.  Code for this
filter can be found in life/plants/filter.

xgraph is MUCH easier to use.

The graphs sure do look interesting...

2 December 1992
---------------
Modified all code to give a more robust error exit.

Beginning to run a series of experiments to see how things vary with:
-- world size
-- initial population
-- number of internal states
-- number of external types expressed
-- amount of water needed to bud
-- amount of light needed to bud
-- amount of water transferred per step

Discovered that in the simulations I'm running, the amount of water
transferred is alarmingly high -- 100 per cycle max ... no wonder the
plants reached the top of the world in no time (about 2000 cycles for
the test run marked 12.03.1992).

The file 12.03.1992 contains a long run with a 1000x600 world, passing
100 units of water, etc.  (More below)

5 December 1992
---------------
Implemented several stochastic analysis tools 
-- a Gaussian random number generator
-- a Fourier transform routine
-- a routine that measures the RMS value of the first derivative of
   smoothed data (the data is first smoothed with a Gaussian, the first
   derivative is calculated, and finally the RMS variation of the first
   derivative over the entire data set is taken).

Because I am not familiar with the above measure, I tried it out on white
noise, Gaussian noise, and a Brownian process.  I note that the RMS
derivative goes as (sigma ** (-3/2)) for both the white and Gaussian
noise, and as (sigma ** (-1/2)) for the Brownian process, where sigma is
the width of the Gaussian with which the data was smoothed.
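
For reference, here is a minimal sketch of such a measure in Python
(not the original source).  It assumes a smoothing kernel of the form
exp(-x**2 / (2 * sigma**2)) truncated at 4 sigma and a plain finite
difference for the derivative.  Both are my assumptions, so the
prefactors will not match any particular implementation, only the
exponents.

```python
import math
import random

def gaussian_kernel(sigma):
    """Normalized Gaussian weights, truncated at 4 sigma (a convention chosen here)."""
    half = int(4 * sigma)
    w = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    total = sum(w)
    return [x / total for x in w]

def rms_derivative(data, sigma):
    """Smooth with a Gaussian of width sigma, difference, and take the RMS."""
    w = gaussian_kernel(sigma)
    n, k = len(data), len(w)
    # Keep only points where the whole kernel fits inside the data.
    smooth = [sum(w[j] * data[i + j] for j in range(k)) for i in range(n - k + 1)]
    diffs = [b - a for a, b in zip(smooth, smooth[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))

def scaling_exponent(data, sigma_lo=4.0, sigma_hi=16.0):
    """Two-point estimate of the exponent in RMS deriv ~ sigma ** exponent."""
    r_lo = rms_derivative(data, sigma_lo)
    r_hi = rms_derivative(data, sigma_hi)
    return math.log(r_hi / r_lo) / math.log(sigma_hi / sigma_lo)

if __name__ == "__main__":
    rng = random.Random(0)
    noise = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    walk, x = [], 0.0
    for step in noise:
        x += step
        walk.append(x)
    print("Gaussian noise exponent:", scaling_exponent(noise))  # expect roughly -1.5
    print("Brownian walk exponent: ", scaling_exponent(walk))   # expect roughly -0.5
```

A two-point slope is crude.  Fitting over many values of sigma would
give a better exponent, at the cost of more CPU.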

My original reason for inventing the first-derivative thing was the
realization that this could provide a very useful measure to understand
the scaling of fractal data.  The idea is that if a fractal truly
scales, then the measure goes as (sigma ** (-1/2 - fractalDim)) or
something to that effect (something uniform).  If the data does not
scale (does not have a fractal dimension), then this measure would
indicate this.  I'm pretty psyched; I'll have to do some analytical work
to see how fractals should behave with this thing.

(Another alternative is to smooth not with a Gaussian, but with a cubic.
Wonder what relation this has to wavelet analysis???)

(One problem with this measure seems to be that it's fairly "noisy" --
you need an awful lot of data to get a smooth curve for two orders of
magnitude of scaling, if not for just one... Burns a lot of CPU).

I have yet to unleash these tools on the data that I've collected.

Source code is currently located in  /src/linas/chaos/stochastic

More accurately:
For Brownian motion, RMS deriv = 0.635 (+/- 0.015) * sigma ** (-0.5)
For Gaussian noise,  RMS deriv = 0.640 (+/- 0.020) * sigma ** (-1.5)
For white noise,     RMS deriv = 0.178 (+/- 0.008) * sigma ** (-1.5)

Data files (in xgraph format) that demonstrate this can be found in 
/src/linas/chaos/stochastic/dat/*.dat

6 December 1992
---------------
Moved around source to simplify paths and makefiles. Source moved to
/src/tools.

Finally taking a look at some of the data.  pn in 12.03.1992 appears to
have a noisy power spectrum, scaling as (omega ** (1.6 (+/-0.2))).  I am
glad to note that the spectrum is NOT omega squared.

For this file, pgw also seems to have a similar scale, but seems
considerably less noisy -- seems to have a very pronounced series of
harmonics ... 31 in all, up to the point where we start aliasing (Nyquist
limit?)  Go figure ...  pra is noisy ...

Actually, increasing the resolution along the frequency axis made the
"harmonics" go away --- I suspect that I am seeing a weird aliasing
artifact due to the fact that I'm keeping only three or four decimal
places of resolution in my data ...

Am printing a bunch of graphs, with the data scaled by (omega ** 1.6)
-- I think I should print them again on semi-log ...  I betcha I won't
get a straight line when I plot log power spectrum vs. frequency ...


Plot files are stored in life/plants/genes/waterneigh/ps, in postscript
form.  The naming convention is ..ps, where
 is pra, pgw, pdrsla, etc.  Some files have a one character
prefix:
s -- contains power spectrum
l -- contains log10 power spectrum
v -- contains that differential variance thing.

--------
It occurs to me that now that I have a Gaussian noise source, I can
compute that population genetic thing numerically ...

11 December 1992
----------------
Profiled the execution of the darned thing.  Turned out that 75% of the
CPU time was being spent in "irradiate_entity".  I redesigned this to
operate in parallel, and got a very dramatic speedup.

Worked on Motif interface.  Motif programming is a bitch.  I'm torn
between using AIC and trying to learn Motif better by doing it myself.

12 December 1992
----------------
Modified/fixed bug in the transition engine.  The problem was complex,
so I describe it here so that I don't make this mistake again.  The
problem was that I had things set up so that a rule ALWAYS had to
identify a neighbor.  I had done this, so that neighbors could be
watered properly.  However, what happened if an entity had no neighbor?
Then the default rule was chosen.  Not a pretty situation.

OK, so I fixed this by having some rules not identify neighbors, thus
matching whether or not there is a neighbor.  Note that as a result,
these rules cause no water to be passed to anybody.   Thus, I would
imagine that selective pressures would cause rules that identify
neighbors to be searched first, and then finally, (maybe because there
are no neighbors), the "don't care if neighbor" rule is selected.  In
particular, this means that the state transitions are NOT independent of
the ordering of the rules, and that therefore, rules should NOT be
sorted to improve performance.
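
To make the order dependence concrete, here is a toy sketch in Python.
The rule layout (state, required neighbor, action) and all the names
are hypothetical and are not how the transition engine actually stores
rules.  The point is only that a first-match search gives different
answers when a "don't care" rule is moved ahead of a specific one.

```python
# Hypothetical rule layout: (state, required_neighbor, action).
# required_neighbor = None means "don't care whether a neighbor exists".

def first_match(rules, state, neighbor):
    """Return the action of the first rule that matches, or the default."""
    for rule_state, required_neighbor, action in rules:
        if rule_state != state:
            continue
        if required_neighbor is None or required_neighbor == neighbor:
            return action
    return "default"

# Same two rules, searched in two different orders.
specific_first = [(3, "leaf", "water leaf"), (3, None, "keep water")]
wildcard_first = [(3, None, "keep water"), (3, "leaf", "water leaf")]

if __name__ == "__main__":
    print(first_match(specific_first, 3, "leaf"))
    print(first_match(wildcard_first, 3, "leaf"))
```

With the specific rule first, an entity with a leaf neighbor waters it.
With the wildcard first, the specific rule is never reached.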

There are things I both like and hate about the directionality of giving
water.  On the one hand, I would like to convert the system so that
water is given equally to all neighbors.  On the other hand, this would
result in geometrically (exponentially!) less water being transferred to
entities farther from the ground.  This would dramatically limit growth.
(That is, it limits growth, because I have things set up so that only a
small, finite amount of water is passed per cycle.  I suppose that I
could set things up so that ALL water is given away each cycle ... or I
could set things up so that an entity keeps a minimum amount of water
for itself (ensuring its own survival), and passes the excess to
any/all neighbors. Hmmmm.)


(H8) Is the system evolving so that more water is passed upwards than is
     passed downwards?  There would seem to be no selective pressure on 
     this, since all we are selecting for is for a balanced set of limb 
     lengths.  If a trait is not actively selected for, is it changing 
     because it is being indirectly selected for, or because it is along 
     for the ride?

(H9) How do we measure genetic diversity?  How do we recognize a
     species? (Presumably we can recognize a phase transition if 
     we plot the correct variables on a scatter plot.  What should 
     these variables be? Suppose we plot genetic difference vs. 
     some trait -- this should do it, right? How should we measure 
     genetic difference? Does species separation result as a phase 
     transition? Is it a first order or a weaker phase transition?)


I was given the idea that I should be measuring the "Shannon Entropy H",
which is given by:

H = - sum (over states i)  P(i) log P(i)

Where P(i) is the probability of being in state i.  I am not entirely
sure what I will be doing with this statistic ...
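
As a reference point, a minimal sketch of the statistic in Python,
using the conventional sign H = -sum P(i) log P(i).  It assumes that
P(i) is estimated from observed state counts and that the log is base 2
(entropy in bits).  Both are choices made here, not something fixed by
the note above.

```python
import math
from collections import Counter

def shannon_entropy(states):
    """H = -sum P(i) * log2 P(i), with P(i) estimated from observed counts."""
    counts = Counter(states)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

if __name__ == "__main__":
    # Four equally likely states carry log2(4) = 2 bits.
    print(shannon_entropy([0, 1, 2, 3]))
```

A population stuck in a single state gives H = 0, and H grows as the
states become more evenly occupied.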

As for species similarities, how about this measure: Let f(n)[i,j] be 
the fractional length of a genetic fragment n that is shared between
gene i and gene j.  Then define

Hd [i,j] = sum (over fragments n) f(n)[i,j] log f(n)[i,j]

Ad [i,j] = sum (over fragments n) f(n)[i,j] ** 2

How should I define f(n): is it the fraction of the length of gene i, or
of gene j? How should I define the fractional length in a symmetric way?
Maybe I should let f(n) be the product of the fractions for gene i and j ...
Yeah ... I like that...  If I do it that way, then presumably we should
have Hd [i,j] = Hd [j,i].

Looks like I have my work cut out for me ...

13 December 1992
----------------
Added the measure of Shannon entropy to the stats package.  In the
process, found a SERIOUS bug in the gene-crossing algorithm -- rules
were being marked as unused, after being copied, and then the cull
routine was being called on the copies, resulting in all rules being
culled ...  Argh, It is SO frustrating to keep on finding bugs in this
system.  It really doesn't raise my confidence level.

Patty says:
may rocky do big doo-doo in your garden

Added the measure of genetic diversity.  Have it set up so that the full
correlation table prints out every 20 winter cycles.  This thing chews
up LOTS of cycles, since it has to make 65*3000*3000 rule comparisons
(since each gene has 3000 or so rules associated with it ... most of
them unused, of course ...).  I guess an obvious optimization to make
would be to cull out the unused rules before doing the comparisons ...
We'll implement this some other day.  It was hard enough to code this up
as it was.

I still haven't figured out H9 -- what DO I plot?

16 December 1992
----------------
Worked out the last few bugs in the genetic diversity code.  Basically,
for gene pairs (i,j), I will be reporting:

Gd [i,j] = sum (over fragments n) g(n)[i,j] 

where g(n)[i,j] = f(n)[i] * f(n)[j]
where f(n)[i] = (length of fragment n) / (length of gene i)
where f(n)[j] = (length of fragment n) / (length of gene j)

One would think that one should have Gd [i,j] = Gd [j,i].  I've got my
code set up so as to test for this (and test my algorithm).  However, it
turns out that discovering common fragments, and measuring their lengths
is not as easy as it might sound.  One can have both embedded matched
fragments, and overlapping matched fragments.  The current algorithm
that I'm using to compute Gd is not symmetric in i and j if these genes
contain overlapping and/or embedded fragments.  I guess we'll find out
how often this occurs ...
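
For comparison, here is a sketch of Gd in Python that hands the
fragment search to the standard library's difflib.SequenceMatcher.
This is a swapped-in technique, not the algorithm described above.
SequenceMatcher returns non-overlapping matching blocks, which avoids
the embedded/overlapping problem, but its results can still depend on
argument order, so symmetry is not guaranteed here either.  Genes are
assumed to be sequences of hashable rules; strings stand in for them
below.

```python
from difflib import SequenceMatcher

def gd(gene_i, gene_j):
    """Gd[i,j] = sum over shared fragments of (len/len_i) * (len/len_j)."""
    if not gene_i or not gene_j:
        return 0.0
    matcher = SequenceMatcher(None, gene_i, gene_j, autojunk=False)
    total = 0.0
    for block in matcher.get_matching_blocks():
        # The final block is a zero-length sentinel and contributes nothing.
        total += (block.size / len(gene_i)) * (block.size / len(gene_j))
    return total

if __name__ == "__main__":
    print(gd("ABCDEF", "ABCDEF"))  # identical genes: one fragment of full length
    print(gd("ABCDEF", "ABCXEF"))  # shared fragments "ABC" and "EF"
    print(gd("ABC", "XYZ"))        # nothing shared
```

Comparing gd(a, b) against gd(b, a) over a population would show how
often the asymmetry actually bites.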

I don't want to go to a more complex fragments search algorithm because
(1) the complexity will make this routine hard to design/debug,
(2) it'll chew up a WHOLE lot more CPU time than my current routine,
which is already chewing up 10-20% of the total time (even though it
runs infrequently).

Got an idea for (H9) -- try a scatter-plot of pairs (<Gd[i]>, Gd[i,j])
where

<Gd[i]> = avg (over j) Gd[i,j]

we'll try this, and see what we get ...

After implementing the speedups, this program runs something like ten
times faster than before. At this rate, I ought to be able to collect a
flood of data in a very short period of time.

Watching it run has given me an opportunity
to rethink basic questions.  First, it sure would be nice to implement
the system so that it can pass light energy about, so that entities in
shade can get light energy.

(H10) How would implementing transferable light energy change the
system?

Also, up until this point, I really haven't thought about the issue of
awareness of surroundings.  Let's approach this this way: I've set up a
rule for artificial selection that encourages plants with evenly spaced
branches, about 10 pixels apart.  Presumably, the effect that this will
have on the population is that it will learn how to count to ten, and
then bud and/or go dormant.  However, the budding is not completely
under the control of the entities themselves.  In part, they need both
water and light to bud.  However, their internal states and state
transition rules are completely insensitive to the amount of energy they
have stored up!  Of course, the genes will do the best they can, but at
the moment, they have no awareness of a characteristic that is quite
important for their survival!  Essentially, the gene population cannot
be as adaptive to the situation as it might need to be...
With awareness of one's own water content, one could make decisions on
how much water to pass to neighbors, and thus promote survival by
not handing out water if one doesn't have enough.

(H11) How does awareness of surroundings affect the adaptability of a
      genetic system?


22 December 1992
----------------
In order to properly calibrate my tools for computing spectra, I
created a fractal noise generator, and cranked some noise through my
tools.   The fractal noise is parameterized by a parameter H, which
measures the self-similarity/scaling of the noise as the time-dimension
is rescaled.  To analyze the data, I have three tools:

 o Fourier Power Spectrum -- the power content of a frequency f should
   scale as  1 / f**(2H+1).
 o Stationarity -- a nickname I've given to Var [ X(t2) -X(t1) ], which
   essentially measures how fast a non-stationary process drifts off.
   For a time DeltaT, the stationarity (variance) should scale as
   DeltaT**2H
 o RMS Derivative -- this tool is explained above.  Including a
   multiplicative factor of sigma, I hypothesize that for a feature size
   of sigma, the RMS derivative should scale as sigma**(H-0.5).
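
The "stationarity" tool is simple enough to sketch.  The Python below
(illustrative names, not the original code) estimates H from the slope
of log Var [ X(t+DeltaT) - X(t) ] against log DeltaT, using the
DeltaT**2H scaling quoted above.  The choice of lags and the plain
least-squares fit are assumptions.

```python
import math
import random

def increment_variance(x, lag):
    """Var[X(t+lag) - X(t)] over all available t."""
    d = [x[t + lag] - x[t] for t in range(len(x) - lag)]
    mean = sum(d) / len(d)
    return sum((v - mean) ** 2 for v in d) / len(d)

def estimate_h(x, lags=(1, 2, 4, 8, 16, 32, 64)):
    """Least-squares slope of log Var against log lag, halved to give H."""
    lx = [math.log(lag) for lag in lags]
    ly = [math.log(increment_variance(x, lag)) for lag in lags]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    num = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    den = sum((a - mx) ** 2 for a in lx)
    return (num / den) / 2.0

if __name__ == "__main__":
    rng = random.Random(0)
    walk, pos = [], 0.0
    for _ in range(50000):
        pos += rng.gauss(0.0, 1.0)
        walk.append(pos)
    # Ordinary Brownian motion should give about 0.5.
    print("estimated H:", estimate_h(walk))
```

Note that increment variance cannot grow faster than DeltaT**2, so this
estimator saturates at H = 1 no matter how smooth the input is.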

Pure Brownian motion has H=0.5, and these tools check out pretty good on
this.  I've also generated fractal Brownian motion using the midpoint
displacement algorithm published in Saupe & Peitgen.  Here, I see some
problems.  For H=0.1, the stationarity seems to indicate that H is
closer to 0.2 or 0.3, not 0.1.  However, the power spectrum looks good.
For H=0.9, the power spectrum would indicate that H never got above
0.55,  although the stationarity looks about right (although it's a little
low). For H=0.5, everything seems to be right on the button (although a
tad less stable than Brownian motion generated by more ordinary
algorithms).

Right now, my conjecture is that the midpoint algorithm is not very
robust; and that a better algorithm is needed.
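
For anyone wanting to repeat the test, here is a sketch of a midpoint
displacement generator in Python.  It is written from memory of the
textbook recursion rather than copied from the book or from my code:
the displacement scale at level i is taken to be
sigma * 0.5**(i*H) * sqrt(1 - 2**(2H-2)), which should be checked
against the published algorithm before trusting it.  The function and
parameter names are made up.

```python
import math
import random

def midpoint_fbm(max_level, h, sigma=1.0, seed=0):
    """Midpoint displacement on 2**max_level + 1 points; returns the list X."""
    rng = random.Random(seed)
    n = 2 ** max_level
    x = [0.0] * (n + 1)
    x[n] = sigma * rng.gauss(0.0, 1.0)
    step = n
    for level in range(1, max_level + 1):
        # Assumed displacement scale at this level (see the caveat above).
        delta = sigma * 0.5 ** (level * h) * math.sqrt(1.0 - 2.0 ** (2.0 * h - 2.0))
        half = step // 2
        # Fill in every midpoint of the current grid from its two neighbors.
        for mid in range(half, n, step):
            average = 0.5 * (x[mid - half] + x[mid + half])
            x[mid] = average + delta * rng.gauss(0.0, 1.0)
        step = half
    return x

if __name__ == "__main__":
    trace = midpoint_fbm(10, 0.5)
    print(len(trace), trace[0], trace[-1])
```

For H = 0.5 the scale reduces to the Brownian bridge value (variance
sigma**2 / 4 at the first midpoint), which is at least consistent with
H = 0.5 being the case that checks out.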

23 December 1992
----------------
Spent some more time trying to calibrate my noisy tools.  The Saupe &
Peitgen book is filled with B.S. Integrating H=0.5 Brownian Noise does
NOT yield H=1.5 noise.  The power spectrum still showed a one over freq
squared distribution (pointing to an effective H=0.5, NOT H=1.5).
However, the stationarity was about 2.0, pointing to an effective H=1.0
-- very interesting.  The RMS derivative equaled 0.5; if my conjecture
that it equals H-0.5 holds, this would again confirm an effective H=1.0.

I think Peitgen & Saupe are a bit glib when they say that integrating
noise increments H by a factor of 1.0, and differentiating it decreases
it by a factor of 1.0.  I think that they are missing the random process
equivalent of poles at H=0.0 and H=1.0, preventing naive "analytic
continuation" beyond these points. Now, If I could only understand what
the random-process analog of "analytic continuation" was ...
(and I hope that they will forgive my very own glibness and errors ...)

In addition, I tested a conjecture I had made.  I conjectured that the
reason I was having trouble getting frequencies to fall off faster than
1/f**2 was because the midpoint recursion algorithm essentially composed
the noise as a series of straight lines.  Now, a straight line has a
power spectrum of 1/f**2, so -- no surprise.  However, a parabola has a
power spectrum of 1/f**4, so maybe if I replaced the midpoint recursion
algorithm by one which did quadratic interpolation to get the midpoints,
I'd be able to see faster roll-off with frequency.  Alas, I did not.  In
all respects, the quadratic interpolation midpoint algorithm seemed to
behave just as the linear interpolation midpoint algorithm. Oh well.


23 September 1993
-----------------
It is important to realize that there are a number of different physical
environments that might be interesting to explore.  Briefly
-- number of types of nutrients needed to grow/survive.
-- availability of nutrients
   o can come from neighbors (e.g. water transmitted from roots)
   o can come from immediate environment 
     + immediately available at constant density (e.g. air)
     + immediately available but subject to change (e.g. light & shadow)
     + immediately available, but subject to random variation
     + limited availability (depletable resource) (e.g. sugar)
     + depletable resource, replenished by diffusion (e.g. water in soil)

-- mobility
   o partial mobility -- development of lower branches can push high 
     branches away (e.g.  grass, trees)
   o no mobility -- cells stay where they are born.

-- space
   o at most one entity occupies a given location, vs. multiple entities
     at same location.  If exclusion exists, then new buds cannot grow
     into occupied spaces. If no mobility, then buds cannot push old
     branches out of the way.

-- connectedness 
   o cells can transmit nutrients, and exchange info with physical
     neighbors, or with logical neighbors. A logical neighbor may 
     be an offspring that is not necessarily in physical contact.
     Offspring can be identified by linked list or binary tree.

-- mutation


I want to explore some of these different universes. Will do so by
adding #defines to various files:
#define RELOCATION -- allows branches to be relocated by growing buds.

-- problem that I have with relocation is that things can get very
   chopped up, when it gets crowded. (more explanation needed here).

-- overcastness -- every entity gets an equal amount of sunlight,
   independent of position. No shadows are cast.  Even though I call this
   "sunlight" in the C code, the model more closely resembles that of cells
   getting an essential nutrient from a medium in a petri dish. (all the
   more so because our model is two dimensional.)

-- reflective boundary conditions. Thought about a dual nutrient system,
   with the plants living between two boundaries.  Thus, plants above a
   line would get nutrient A automatically, while those below the line
   would get nutrient B.  All entities need both to survive and grow.
   Thus, each side would have to learn to pass each to the other side.

   In fact, we don't have to necessarily implement both sides to study
   this phenomenon.  There are two things one can do: (1) entities on the
   ground plane have access to an infinite amount of nutrient A (water)
   and must pass it up the chain. Call this "infinite" boundary
   condition. (2) Entities must pass nutrient B down to the ground
   plane, where it is converted to nutrient A, (only in the amount passed
   down).  This is like a "reflective" boundary condition, which forces a
   symmetry between the two sides.

   But - for interesting innervation to be seen, I suspect that light
   would have to be distributed in a diffusion-limited fashion.  In such a
   case, there would be selective pressures that would encourage a
   maximization of surface area, so that more light could be picked up.
   Also, it would be pointless to have branches too close to one another
   since they would compete for the same limited amount of light.


(H?) Morphology. Describe in words the morphology seen.



28 December 1993
----------------
Wrote a little program that computes the correlation coefficient between
two different lines of data.  A sliding Gaussian is passed over the
data, so that we can look to see if the correlation in one part of the
time series is close to another part.  The program is in
/src/tools/src/cmds/numerical/correlation.c (The correlation
coefficient is the usual normalized covariance thing -- it runs
between +1 and -1).
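
The idea can be sketched in a few lines of Python.  This is not the
contents of correlation.c, just an illustration: the Gaussian window
exp(-(t - center)**2 / (2 * width**2)) and the function name are
assumptions.

```python
import math

def windowed_correlation(x, y, center, width):
    """Pearson correlation of x and y, weighted by a Gaussian window
    centered at index `center` with standard deviation `width`."""
    w = [math.exp(-0.5 * ((t - center) / width) ** 2) for t in range(len(x))]
    total = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / total
    my = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / total
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / total
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / total
    return cov / math.sqrt(vx * vy)

if __name__ == "__main__":
    xs = [float(t) for t in range(100)]
    ys = [2.0 * v + 1.0 for v in xs]
    # Slide the window across the series and report the local correlation.
    for center in (10, 50, 90):
        print(center, windowed_correlation(xs, ys, center, 10.0))
```

The width of the window matters: too broad a Gaussian smears out sharp
changes in correlation, too narrow a one makes the estimate noisy.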

30 December 1993
----------------
Dang.  Just got done looking at a ton of data, and then discovered a bug
with the correlation filter that invalidates all of the data. Dang. 
Try again.

Running this program over the 12.18.1992 data set, we see the following:

The following pairs appear to be highly correlated:
 -- starting at 0.96, and rising to 1.0 more or less linearly.
 -- bumps and dips, 0.990 +/- 0.008, finishing at 0.999

 -- starting at 0.986, rapidly rising to 0.996 at 2K years,
               and then gradually to 1.0

 -- at 1.0 to within roundoff errors.

 -- starting at 1.0, dipping to 0.97 at year 5K, rising back to 1.0

 -- starting at 0.990, rising to 0.996, dipping to 0.964,
               rising to 0.996 again.

 -- bumps and dips, at 0.973 +/- 0.008
 -- bumps and dips, at 0.972 +/- 0.014


Strong negative correlation:
 -- wavers around negative 0.987 +/- 0.012 
 -- wavy trend from -0.984 at year 0 to -0.998 by year 8K

pdruha appears to correlate with nothing else:
All of the following exhibit the same general behavior: high correlation 
till year 2K , then dropping to 0.2 correlation linearly by year 6K:
 -- 0.87 to year 2K, drop to 0.15 by year 8K
 -- 0.98 to year 2K, drop to 0.15 by year 8K
 -- 0.98 to year 2K then drop to 0.2 by year 8K
 -- -0.98 to year 2K going to -0.2 by year 8K

 -- big lump at 0.93+/- 0.03, till year 5K, then dropping to
                0.4 by year 8K

 -- remains correlated (at about 0.9 to 0.95) to year 6K, 
                then dropping rapidly to 0.4 by year 8K.

Correlation to pn:
 -- wavers around 0.88 +/- 0.06 till year 6K then drops to 0.65
 -- wavers around 0.98 +/- 0.02 till year 6K then drops to 0.68
 -- wavers around 0.94 +/- 0.04 till year 6K then drops to 0.68
 -- flat around 0.99 +/- 0.01 till year 5K then drops to 0.68
 -- wavers around 0.86 +/- 0.06 till year 6K then drops to 0.67
 -- wavers around 0.89 +/- 0.07 till year 6K then drops to 0.68
 -- drops from 0.96 at year 2K to 0.67 by year 8K
 -- flat at 0.98 to year 2K, waver around 0.92 +/- 0.03 to
              year 6K, then drop to 0.68 by year 8K

This one's negative: 
 -- flat at -0.985 +/- 0.003 to year 6K, going to -0.68 at year 8K


Weak correlation:
 -- around 0.9 to year 3K, drop to 0.5 by year 5K,
              and back to 0.9 by year 8K
 -- around 0.8 to year 3K, drop to 0.54 by year 5K,
             and up to 0.94 by year 8K
 -- around 0.93 to year 3K, drop to 0.55 by year 5K,
               and back to 0.94 by year 8K

 -- wavery climb from 0.88 at year 1K to 0.99 by year 8K
 -- waver around 0.86 +/- 0.05, eventually climbing to 0.99

The following are negative correlations:
 -- waver around -0.89 +/- 0.03, eventually finishing at -0.99
 -- start around -0.94 to year 3K, drop to -0.75 by year 5K,
             and back to -0.96 by year 8K
 -- big up-down wave between -0.90 to -0.985
 -- big up-down wave between -0.94 to -0.990

The following should be expected, since ptbla is a component
(indirectly) of pgw:
 -- big dip -- starts at -1.0 before year 2K, dips to -0.945
               at year 5K, then back to -1.0

 -- starts low -0.84, then rises-dips between -0.98 and -0.92

Other assorted odd correlations:
 -- big wave between 0.78 and 0.96 -- doesn't appear to fit
                with any of the above patterns
 -- big dip, between 0.985 and 0.89 -- again, it's
                  different.


Hmm. I've gone back to look at these in greater detail; I see I used a
smearing factor that was too broad, and wiped out a lot of detail. In
fact, a lot of the drops of correlation are a lot sharper, and drop not
to 0.2, but to 0.0. I'll try again, and print up the sharper versions.