Week 3
I. Levels of Visualization
- Visualization of a single neural network, be it a static weighted
graph, an animation, a report of its fitness scores on various tests, or
what have you, I will call network visualization.
- Visualizing the "family tree" for a neural network evolved via the
genetic algorithm I will call heredity visualization.
- Finally, I will refer to the visualization of the evolution of an
entire population of neural networks as evolutionary visualization.
II. Heredity Visualization
- If we draw a standard 2-dimensional tree diagram, and we wish to
display n generations of the family tree, we are bounded
vertically by n, and horizontally by p^(n-1),
where p is the number of parents per child (a small sketch of this
bound appears at the end of this section). Available space
will therefore impose limitations upon the maximum values of p
and n. Note that the number of networks in the total population
pool is not relevant here.
- Three possible views (could have buttons to switch between them):
- Selected network at the top, descendants branching out below.
- Selected network at the bottom, ancestors branching out above.
- Selected network at the center, ancestors and descendants above
and below.
- By double-clicking on a network, you can re-center the tree on that
network.
- Tree display should be zoomable.
- With the combination of re-centering and zooming, the user should be
able to compensate quite a bit for any scalability problems that might
arise.
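- As a rough illustration of the p^(n-1) bound above, here is a small
sketch (the function name and return layout are my own, purely
illustrative) of how wide and how large an n-generation ancestor view
becomes:

    def ancestor_tree_size(p: int, n: int) -> tuple[int, int]:
        """Return (widest_row, total_nodes) for an ancestor view showing
        n generations with p parents per child."""
        widest_row = p ** (n - 1)                     # oldest displayed generation
        total_nodes = sum(p ** k for k in range(n))   # 1 + p + ... + p^(n-1)
        return widest_row, total_nodes

    # With 2 parents per child, showing 6 generations already requires a
    # row of 32 networks (63 nodes in total):
    print(ancestor_tree_size(2, 6))  # (32, 63)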
III. Evolutionary Visualization
- The most basic feature would be a graph of each generation's maximum,
minimum, and average fitness vs. time (a small plotting sketch appears
at the end of this section).
- Should also plot generation diversity vs. time (see
Computing Diversity below).
- You should be able to look at each generation's fitness distribution.
- It would also be helpful to have a graph of volatility (see the
Research section below), as well as the volatility
distribution for each generation.
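- A minimal sketch of the basic fitness-vs-time graph, assuming each
generation is simply a list of fitness scores (the data layout and
function name are hypothetical, not part of any existing code):

    import matplotlib.pyplot as plt

    def plot_fitness_history(generations):
        """generations[t] is the list of fitness scores of generation t.
        Plots maximum, minimum, and average fitness vs. time."""
        xs = range(len(generations))
        plt.plot(xs, [max(g) for g in generations], label="max fitness")
        plt.plot(xs, [min(g) for g in generations], label="min fitness")
        plt.plot(xs, [sum(g) / len(g) for g in generations],
                 label="average fitness")
        plt.xlabel("generation")
        plt.ylabel("fitness")
        plt.legend()
        plt.show()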
IV. Computing Diversity
- All neural networks in a population will have the same number of
input and output nodes, since they are all being trained to solve the
same problem. The three respects in which a pair of neural networks
can differ, therefore, are the number of hidden nodes, the edge weights,
and the connectivity. The following procedures show how to combine these
three factors in a natural and intuitive way to produce a single figure
for the diversity of a population (a code sketch of both procedures
appears at the end of this section).
- To calculate Difference(N1, N2)
(the difference between two neural networks):
- If N1 and N2 have the same number of
hidden nodes, we can give them the same connectivity by introducing
"dummy" (0-weight) edges to make each network fully connected. Since
the two networks now differ only in their edge weights, we can square
the numerical difference between every pair of corresponding edges,
sum the squares, and take the square root of the sum. If there are w
edge weights, this is the Euclidean distance between the two networks
in w-dimensional space.
- If the number of hidden nodes is different, we can simply
introduce an appropriate number of "dummy" nodes into the smaller of
the two networks, where a dummy node is connected to the other nodes
by 0-weight edges. Since 0-weight edges don't do anything, we are
in no way changing the behavior of the network by introducing these
extra nodes. We can then use the procedure for networks with the
same number of hidden nodes.
- To calculate Diversity(N1, N2, ..., Np)
(the diversity in a population of neural networks):
- Find which of the p networks has the largest number of
hidden nodes. Let this number be denoted by h.
- Using the methods employed by
Difference(N1, N2), transform all networks
in the population into fully connected networks with h hidden
nodes.
- The p neural networks can now be thought of as a collection
of p points in w-dimensional space, where w is
the number of edge weights in each (now fully connected) network. To
find the "center" of these points, we simply sum the vectors for each
point (i.e. the vectors from the origin to that point), and divide by
p. We can then compute the Euclidean distance between this
center point and each of the p points, sum these distances, and return
the result.
- This is similar to the way moment of inertia is computed in
physics.
- The diversity figure, when computed this way, will tend to be larger
for larger populations. It may also be useful to compute a normalized
diversity figure, i.e. the diversity figure given above divided by the
population size, so that comparisons may be made between population
pools of different size.
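- Here is a sketch of both procedures, under one simplifying assumption
of my own: each network is stored as a square weight matrix over all of
its nodes, ordered (inputs, outputs, hidden), with entry [i, j] holding
the weight of the edge from node i to node j and 0 meaning "no edge",
so that adding dummy hidden nodes is just zero-padding. All names here
are illustrative.

    import numpy as np

    def pad_network(weights: np.ndarray, extra_hidden: int) -> np.ndarray:
        """Add dummy hidden nodes, connected only by 0-weight edges.
        Hidden nodes come last in the ordering, so this is zero-padding."""
        return np.pad(weights, ((0, extra_hidden), (0, extra_hidden)))

    def difference(n1: np.ndarray, n2: np.ndarray) -> float:
        """Euclidean distance between two networks in edge-weight space."""
        size = max(n1.shape[0], n2.shape[0])
        n1 = pad_network(n1, size - n1.shape[0])
        n2 = pad_network(n2, size - n2.shape[0])
        return float(np.sqrt(np.sum((n1 - n2) ** 2)))

    def diversity(population: list[np.ndarray],
                  normalized: bool = False) -> float:
        """Sum of distances from each network to the population's center."""
        size = max(net.shape[0] for net in population)    # set by h, the
        padded = [pad_network(net, size - net.shape[0])   # largest hidden count
                  for net in population]
        points = np.stack([net.ravel() for net in padded])  # p points, w dims
        center = points.mean(axis=0)              # average of the p vectors
        total = float(np.linalg.norm(points - center, axis=1).sum())
        return total / len(population) if normalized else total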
V. Center Networks
- Using the techniques described in the previous section, it is easy
to introduce a function
Center(N1, N2, ..., Np) which
returns the "center" or average network in a population, i.e. the network
whose weights correspond to the point in w-dimensional space
(w is the number of edge weights) which represents the average of the
p points corresponding to the networks in the population pool. Note
that this hypothetical network need not actually exist in the pool. This
Center() function may be useful for a number of things (a small sketch
appears at the end of this section).
- By calculating the Difference() between the center of one
generation and the next, you can determine how much of a leap was made
between generations.
- By examining the center networks of a series of generations, you
may be able to get a general sense of the direction in which the
evolution is headed.
- When looking at the family tree for a network, you could examine the
center network for each generation in the tree, allowing you to more
readily see how a specific network came about.
- If the networks in each generation could be grouped into clusters of
similar networks, and the center computed for each cluster, it may be
possible to automatically produce the kind of ape-to-man diagrams that
you see in biology textbooks.
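- A minimal sketch of Center(), reusing the weight-matrix representation
and padding idea from the sketch in the previous section (names remain
illustrative assumptions of mine):

    import numpy as np

    def center(population: list[np.ndarray]) -> np.ndarray:
        """Average network of a population: pad every weight matrix to the
        largest size in the pool (dummy 0-weight nodes), then average the
        matrices component-wise. The result need not exist in the pool."""
        size = max(net.shape[0] for net in population)
        padded = [np.pad(net, ((0, size - net.shape[0]),) * 2)
                  for net in population]
        return np.mean(np.stack(padded), axis=0)

    # The "leap" between consecutive generations could then be measured as
    # difference(center(generation_t), center(generation_t_plus_1)), using
    # the difference() sketch from the previous section.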
VI. Research
- Two types of breeding or recombination methods are used in
Evolution Strategies. In sexual recombination, a new individual is
created from a set of n parents, where n depends on the type
of sexual recombination being used. The other type is panmictic
recombination, where one parent is held fixed, and then, for each
component of the new individual (i.e. each edge weight), a second parent
is selected at random from the population pool (see the sketch at the
end of this section). Panmictic recombination
thus involves (potentially) a much branchier family tree, and would
therefore be harder to visualize.
- When using Evolution Strategies, a number of mutation variables are
associated with each neural network (probably the standard deviations for
the mutations of each edge weight). By summing up these standard
deviations, you could obtain a volatility figure for a given
network, which would indicate its potential for change. The average
volatility for a generation could be computed, and, when graphed next to
the diversity, would tell you a lot about the way the evolution was
progressing. For instance, if volatility was low and diversity was
decreasing, you might conclude that the population was "zeroing in" on
a solution.
- I do not think it is necessary to modify my Difference()
function to incorporate differences in mutation parameters, since the
mutation parameters affect changes in the network across generations, and
the Difference() function is intended to operate on a pair of
neural networks as they exist at a particular moment in time. It may be
appropriate, however, to modify my network visualization to include some
indication of the volatility of each edge weight.
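- A sketch of panmictic recombination and of the volatility figure,
assuming each individual carries a flat list of edge weights plus one
mutation standard deviation per weight, and assuming discrete
recombination (each component is copied whole from one of the two
parents); all names are my own, illustrative choices.

    import random

    class Individual:
        def __init__(self, weights, sigmas):
            self.weights = list(weights)  # edge weights
            self.sigmas = list(sigmas)    # mutation std. dev. per weight

    def panmictic_recombine(fixed_parent, pool):
        """For each component, draw a fresh second parent from the pool
        and take the value from either it or the fixed parent at random."""
        weights, sigmas = [], []
        for i in range(len(fixed_parent.weights)):
            other = random.choice(pool)
            donor = random.choice((fixed_parent, other))
            weights.append(donor.weights[i])
            sigmas.append(donor.sigmas[i])
        return Individual(weights, sigmas)

    def volatility(individual):
        """Sum of mutation standard deviations: potential for change."""
        return sum(individual.sigmas)

    def average_volatility(generation):
        return sum(volatility(ind) for ind in generation) / len(generation)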