Implementation of DEvelopmentAl Learning (IDEAL) Course

The construction of reality in the developmental agent

Anticipating the effects of composite interactions

As shown in the algorithm in Table 33-3, the anticipate() function returns a list of anticipations. Each anticipation associates an experiment with a proclivity value for performing it. The proclivity is computed from the interactions that may actually be enacted as an effect of performing this experiment, as far as the system can tell from its experience.
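
To make this concrete, here is a minimal Java sketch of the anticipation mechanism. The class and field names are illustrative assumptions, not the course's actual code: each activated interaction proposes the experiment of its post-interaction, and proclivities accumulate the product weight × valence used in the earlier lessons.

import java.util.ArrayList;
import java.util.List;

// Minimal data structures assumed by the sketches on this page.
class Experiment {}
class Result {}

class Interaction {
    Experiment experiment;       // the (possibly abstract) experiment it answers
    Result result;               // the (possibly abstract) result it returns
    Interaction preInteraction;  // null for primitive interactions
    Interaction postInteraction; // null for primitive interactions
    int valence;                 // motivational value of this interaction
    int weight;                  // reinforcement weight
}

// An anticipation associates an experiment with a proclivity for performing it.
class Anticipation {
    Experiment experiment;
    int proclivity;
    Anticipation(Experiment experiment, int proclivity) {
        this.experiment = experiment;
        this.proclivity = proclivity;
    }
}

class Agent {
    List<Anticipation> anticipate(List<Interaction> activatedInteractions) {
        List<Anticipation> anticipations = new ArrayList<>();
        for (Interaction activated : activatedInteractions) {
            Interaction proposed = activated.postInteraction;
            int proclivity = activated.weight * proposed.valence;
            // Accumulate the proclivity if this experiment was already proposed.
            Anticipation existing = null;
            for (Anticipation a : anticipations)
                if (a.experiment == proposed.experiment)
                    existing = a;
            if (existing == null)
                anticipations.add(new Anticipation(proposed.experiment, proclivity));
            else
                existing.proclivity += proclivity;
        }
        return anticipations;
    }
}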

For the anticipate() function to work with composite interactions in the same way as it does with primitive interactions, composite interactions must also be associated with experiments. That is, the system must learn that a composite interaction corresponds to an abstract experiment, performed with reference to an abstract environment (the reactive part), that returns an abstract result.

Notice how this problem fits the constructivist learning challenge (introduced in the readings on Page 36): learning to interpret sensorimotor interactions as the performance of experiments on an external reality, and to interpret the results of these experiments as information about that reality.

The rest of this page presents our first step towards addressing this challenge. We will develop this question further in the next lesson.

Recursively learning composite interactions

Figure 44 illustrates how the learnCompositeInteraction() function can be implemented to learn a hierarchy of composite interactions recursively.

Figure 44: Recursive learning of composite interactions.

Figure 44 distinguishes between the Interaction Time (arrow at the bottom, corresponding to the agent/environment coupling) and the Decision Time (staircase-shaped arrow, corresponding to the proactive/reactive coupling, which rises over time). Learning occurs at the level of the Decision Time, to learn higher-level composite interactions on top of enacted composite interactions. The gray rectangles indicate the composite interactions that are learned or reinforced at the end of decision cycle t_d. The system learns the composite interaction ⟨e_{cd-1}, e_{cd}⟩ made of the sequence of the previously enacted composite interaction e_{cd-1} and the last enacted composite interaction e_{cd}. This is similar to Page 32, except that the learning can apply to composite interactions rather than only to primitive interactions. Additionally, the system learns the composite interaction ⟨e_{cd-2}, ⟨e_{cd-1}, e_{cd}⟩⟩. This way, if e_{cd-2} is enacted again, ⟨e_{cd-2}, ⟨e_{cd-1}, e_{cd}⟩⟩ will be re-activated and will propose to enact its post-interaction ⟨e_{cd-1}, e_{cd}⟩. The system has thus learned to re-enact ⟨e_{cd-1}, e_{cd}⟩ as a sequence, hence the self-programming effect. The higher-level composite interaction ⟨⟨e_{cd-2}, e_{cd-1}⟩, e_{cd}⟩ is also learned, so that it can be re-activated in the context when ⟨e_{cd-2}, e_{cd-1}⟩ is enacted again, and propose its post-interaction e_{cd}.
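
In code, the learning step at the end of decision cycle t_d could look like the sketch below, given as further members of the illustrative Agent class above (java.util imports omitted). Here, addOrGetAndReinforce() retrieves the composite interaction ⟨pre, post⟩, creating it if it is unknown, and reinforces its weight, in the spirit of Page 32; all names remain assumptions.

    // Known composite interactions, keyed by their pre- and post-interactions.
    Map<List<Interaction>, Interaction> knownComposites = new HashMap<>();

    // Learn or reinforce the three composite interactions of Figure 44 at the
    // end of decision cycle t_d, from the last three enacted interactions.
    void learnCompositeInteractions(Interaction eCd2, Interaction eCd1, Interaction eCd) {
        // <e_{cd-1}, e_{cd}>: the Page-32 learning, lifted to composite interactions.
        Interaction sequence = addOrGetAndReinforce(eCd1, eCd);
        // <e_{cd-2}, <e_{cd-1}, e_{cd}>>: re-activated when e_{cd-2} is enacted
        // again; proposes re-enacting the whole sequence (self-programming).
        addOrGetAndReinforce(eCd2, sequence);
        // <<e_{cd-2}, e_{cd-1}>, e_{cd}>: re-activated when <e_{cd-2}, e_{cd-1}>
        // is enacted again; proposes its post-interaction e_{cd}.
        addOrGetAndReinforce(addOrGetAndReinforce(eCd2, eCd1), eCd);
    }

    Interaction addOrGetAndReinforce(Interaction pre, Interaction post) {
        Interaction ic = knownComposites.computeIfAbsent(List.of(pre, post), k -> {
            Interaction created = new Interaction();
            created.preInteraction = pre;
            created.postInteraction = post;
            created.valence = pre.valence + post.valence; // as in earlier lessons
            return created;
        });
        ic.weight++; // reinforce
        return ic;
    }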

Associating abstract experiments and results with composite interactions

When a new composite interaction i_c is added to the set I_d of known interactions at time t_d, a new abstract experiment e_a is added to the set E_d of known experiments, and a new abstract result r_a is added to the set R_d of known results, such that i_c = ⟨e_a, r_a⟩.
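
Continuing the sketch, the creation step can pair every new composite interaction with a fresh abstract experiment and abstract result. The helper below would replace the creation lambda inside addOrGetAndReinforce(); again, all names are assumptions.

    List<Experiment> knownExperiments = new ArrayList<>(); // E_d
    List<Result> knownResults = new ArrayList<>();         // R_d

    // When a new composite interaction i_c is created, also create e_a and r_a
    // such that i_c = <e_a, r_a>.
    Interaction createCompositeInteraction(Interaction pre, Interaction post) {
        Interaction ic = new Interaction();
        ic.preInteraction = pre;
        ic.postInteraction = post;
        ic.valence = pre.valence + post.valence;

        Experiment ea = new Experiment(); // abstract experiment e_a
        Result ra = new Result();         // abstract result r_a
        knownExperiments.add(ea);         // e_a is added to E_d
        knownResults.add(ra);             // r_a is added to R_d
        ic.experiment = ea;               // so that i_c = <e_a, r_a>
        ic.result = ra;
        return ic;
    }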

Abstract experiments are called abstract because the environment cannot process them directly. The environment (or the robot's interface) is only programmed to interpret a predefined set of experiments, which we now call concrete. To perform an abstract experiment e_a, the agent must perform a series of concrete experiments and check their results. That is, the agent must try to enact the composite interaction i_c from which the abstract experiment e_a was constructed.

If the chooseExperiment() function chooses experiment e_a, then the system tries to enact i_c. If this tentative enaction fails and results instead in the enacted composite interaction e_c ∈ I_{d+1}, then the system creates the abstract result r_f ∈ R_{d+1}, such that e_c = ⟨e_a, r_f⟩.
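
A sketch of this failure path, assuming a helper enact() that runs i_c's concrete experiments step by step and returns the actually enacted interaction:

    // Try to perform the abstract experiment e_a by enacting the composite
    // interaction i_c from which e_a was constructed.
    Interaction performAbstractExperiment(Experiment ea, Interaction ic) {
        Interaction ec = enact(ic); // assumed helper: step-by-step enaction
        if (ec != ic) {
            // The tentative enaction failed: create the abstract result r_f
            // so that e_c = <e_a, r_f>.
            Result rf = new Result();
            knownResults.add(rf); // r_f is added to R_{d+1}
            ec.experiment = ea;
            ec.result = rf;
        }
        return ec;
    }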

The next time the system considers choosing experiment e_a, it will compute the proclivity for e_a based on the anticipation of successfully enacting i_c and getting result r_a, balanced against the anticipation of actually enacting e_c and getting result r_f.
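
One simplified way to realize this balancing (not necessarily the course's actual bookkeeping) is to remember which interactions each abstract experiment has actually yielded, i_c on success and e_c on failure, and to sum their weighted valences:

    // Interactions that each abstract experiment has actually yielded so far.
    Map<Experiment, List<Interaction>> outcomes = new HashMap<>();

    void recordOutcome(Experiment ea, Interaction enacted) {
        outcomes.computeIfAbsent(ea, k -> new ArrayList<>()).add(enacted);
    }

    // Proclivity for e_a: the anticipation of succeeding (enacting i_c, getting
    // r_a) balanced with the anticipation of actually enacting e_c and getting
    // r_f, each weighted by the outcome's reinforcement weight.
    int proclivity(Experiment ea) {
        int proclivity = 0;
        for (Interaction outcome : outcomes.getOrDefault(ea, new ArrayList<>()))
            proclivity += outcome.weight * outcome.valence;
        return proclivity;
    }

In this sketch, performAbstractExperiment() above would call recordOutcome(ea, ec) before returning, so that both possible outcomes compete the next time e_a is considered.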

As a result of this mechanism, composite interactions can take two forms: the sequential form ⟨pre-interaction, post-interaction⟩ and the abstract form ⟨experiment, result⟩. We distinguish the abstract form by writing the experiment and the result in capital letters, separated by the "|" symbol: ⟨EXPERIMENT|RESULT⟩. We will use this notation in the trace on Page 46.

This mechanism is a critical step toward implementing self-programming agents. Nevertheless, it raises many questions that remain to be addressed, for example: how should experiments and results be organized so as to construct a coherent model of reality?

