## Generating Bottom Up

### 19 Oct 2014

After yesterday's very easy tree generation I wondered if something I had recently done in R (though not posted here yet) could be done easily in `clojure.test.check` as well. The code below shows I got it to work, but there is one point, `next-step-fn`, that is really way too ugly. After I get some more familiarity with `clojure.test.check` I'll come back and try to clean it up.

The basic problem is to generate a map with some values where the distributions of those values are not independent. For instance, if the map has a key :a, having value 1 for :a makes it more likely for key :behavior to have value :act. Because I think of this data as input for machine learning tasks, I call the keys predictors. In the example below there is one special predictor, :behavior. This is the one we focus on: we want to give it a value that is random but dependent on the other predictor values.

The requires I used are as follows
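Judging from the gen/ and rose/ prefixes used further down, the requires would look roughly like the following sketch. The namespace name `bottom-up.sketch` is a placeholder and not taken from the original code.

```clojure
;; Assumed namespace name; only the two aliases are implied by the text.
(ns bottom-up.sketch
  (:require [clojure.test.check.generators :as gen]
            [clojure.test.check.rose-tree :as rose]))
```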

## The Generation Graph

This graph describes how the datum we want has to be generated. It is a map whose keys are the vertices of the graph and whose values describe the edges. At this time three types of edges are implemented:

- {}, the empty edge. When we get to a vertex with this as its value, the generation process is finished.
- A map naming a predictor X together with a vector of value maps indicates that on leaving this vertex the predictor X will get a value. In each value map, freq gives the relative frequency of that value, :value gives the value the predictor gets, and :next indicates which vertex becomes active next.
- A map naming a predictor X together with a generator value-gen and a next vertex :N indicates that on leaving this vertex the predictor X will get a value. This value will be generated using the generator value-gen, and the next vertex will be :N.
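As a rough illustration, the three edge types could be written as plain Clojure data like this. All key names (`:predictor`, `:values`, `:freq`, `:value`, `:next`, `:gen`) and the vertex name `:N` are assumptions made for this sketch and may differ from the original code.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Empty edge: reaching a vertex with this value ends the generation.
(def empty-edge {})

;; Weighted-choice edge (assumed key names): predictor :F1 gets 1 three
;; times as often as 2; either way the walk continues at vertex :N.
(def choice-edge
  {:predictor :F1
   :values [{:freq 3 :value 1 :next :N}
            {:freq 1 :value 2 :next :N}]})

;; Generator edge (assumed key names): predictor :F2 gets a value drawn
;; from a test.check generator, then the walk continues at vertex :N.
(def generator-edge
  {:predictor :F2
   :gen gen/boolean
   :next :N})
```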

We currently experiment with the graph g1, which exhibits all three edge types.

In this graph we have three predictors: :F1, :F2 and :behavior. :behavior can have the values :act or :no-act, and the values of the other predictors influence the probabilities of these values. To easily generate different action behaviors we have the function acter.
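A helper of this kind might look like the sketch below. The name `acter-sketch`, its three-argument signature and the key names are all assumptions for illustration, not the original acter.

```clojure
(defn acter-sketch
  "Hypothetical stand-in for acter: builds a weighted-choice edge that sets
  :behavior to :act or :no-act and then moves on to next-vertex."
  [act-freq no-act-freq next-vertex]
  {:predictor :behavior
   :values [{:freq act-freq    :value :act    :next next-vertex}
            {:freq no-act-freq :value :no-act :next next-vertex}]})

;; An edge that acts nine times out of ten and then stops at :end.
(acter-sketch 9 1 :end)
```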

Here, then, is the graph we want to generate along:

Note: many of the functions below refer directly to g1. Clearly they should take it as an additional argument.

The next function, `next-step-fn`, is the main problem. Clearly the use of gen/call-gen and (even worse) rose/pure is out of place. This is the function the user of bottom-up-gen should write, and this user should not have to bother with such details.
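For comparison, here is a minimal sketch of one possible cleanup direction that stays within the public combinators `gen/bind`, `gen/fmap`, `gen/frequency` and `gen/return`. It is not the function discussed above: the names `g-sketch`, `step-gen` and `walk-gen`, the key names and the frequencies are all assumptions, and the graph is passed in as an argument.

```clojure
(require '[clojure.test.check.generators :as gen])

;; Hypothetical graph, not the original g1. Key names are assumed.
(def g-sketch
  {:start  {:predictor :F1
            :values [{:freq 1 :value 0 :next :quiet}
                     {:freq 1 :value 1 :next :busy}]}
   :quiet  {:predictor :F2 :gen gen/boolean :next :rarely}
   :busy   {:predictor :F2 :gen gen/boolean :next :often}
   :rarely {:predictor :behavior
            :values [{:freq 1 :value :act    :next :end}
                     {:freq 9 :value :no-act :next :end}]}
   :often  {:predictor :behavior
            :values [{:freq 9 :value :act    :next :end}
                     {:freq 1 :value :no-act :next :end}]}
   :end    {}})

(defn step-gen
  "Generator of [value next-vertex] pairs for one non-empty edge."
  [{:keys [values] value-gen :gen next-vertex :next}]
  (if values
    ;; Weighted-choice edge: pick one value map by relative frequency.
    (gen/frequency
      (for [{:keys [freq value] nxt :next} values]
        [freq (gen/return [value nxt])]))
    ;; Generator edge: draw the value, always continue at next-vertex.
    (gen/fmap (fn [v] [v next-vertex]) value-gen)))

(defn walk-gen
  "Generator that walks graph from vertex and collects predictor values
  into a map, stopping at the first empty edge."
  ([graph vertex] (walk-gen graph vertex {}))
  ([graph vertex acc]
   (let [edge (get graph vertex)]
     (if (empty? edge)
       (gen/return acc)
       (gen/bind (step-gen edge)
                 (fn [[v next-vertex]]
                   (walk-gen graph next-vertex
                             (assoc acc (:predictor edge) v))))))))

;; Draw five maps, each with the keys :F1, :F2 and :behavior.
(gen/sample (walk-gen g-sketch :start) 5)
```

Because each step is chained with `gen/bind`, the choice made at one vertex decides which vertex (and so which distribution) comes next, which is what makes :behavior depend on :F1 in this sketch.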

With `next-step-fn` in place and working as intended (though not implemented as it should be), we can make the following definitions.

With that in place we can call the following

This call has the following result (depending on the state of your random generator, I believe). If you generate a larger number of samples you can do the analysis and see that the predictor :behavior is indeed dependent on the other predictors.
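One simple way to do such an analysis is to count the :behavior values separately for each value of another predictor. The sketch below assumes the generated data is a sequence of maps with a :behavior key. The helper name `behavior-by` is hypothetical, and `toy-data` is hand-written only to show the shape of the input, not actual generator output.

```clojure
(defn behavior-by
  "For each value of predictor k, count how often each :behavior value occurs."
  [k data]
  (into {}
        (for [[v rows] (group-by k data)]
          [v (frequencies (map :behavior rows))])))

;; Hand-written toy rows, only to illustrate input and output shape.
(def toy-data
  [{:F1 0 :F2 true  :behavior :no-act}
   {:F1 0 :F2 false :behavior :no-act}
   {:F1 1 :F2 true  :behavior :act}
   {:F1 1 :F2 false :behavior :no-act}])

(behavior-by :F1 toy-data)
;; => {0 {:no-act 2}, 1 {:act 1, :no-act 1}}
```

If the counts per predictor value differ clearly in a large sample, :behavior is not independent of that predictor.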