Background
The bandrobot test, one of the demos in ONA, is aimed at testing the reasoner's multistep event inferencing/subgoaling (via NAL-7 & NAL-8 temporal/procedural inference).
The scene, rendered as ASCII art, looks like this:
+++++++++++++++++++++|
---------------------|
A |
o |
'''U'''''''''''''''''|
This is a single-player game; the main goal is to control the robot A, pick up the ball o, and drop it into the bucket U.
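For reference, a minimal sketch of this kind of world is shown below. This is a simplified stand-in, not the actual bandrobot demo code from ONA; the coordinates, action set, and success condition are assumptions made purely for illustration.

// Minimal, self-contained sketch of a bandrobot-like world.
// NOT the actual ONA demo code; positions and actions are assumed.
#include <stdio.h>
#include <stdbool.h>

typedef enum { LEFT, RIGHT, PICK, DROP } Action;

typedef struct {
    int robot;      // x position of the robot (A)
    int ball;       // x position of the ball (o)
    int bucket;     // x position of the bucket (U)
    bool carrying;  // whether the robot currently holds the ball
} World;

// Apply one action and return true when the ball lands in the bucket.
static bool step(World *w, Action a) {
    switch (a) {
        case LEFT:  w->robot -= 1; break;
        case RIGHT: w->robot += 1; break;
        case PICK:  if (w->robot == w->ball) w->carrying = true; break;
        case DROP:
            if (w->carrying) {
                w->carrying = false;
                w->ball = w->robot;
                if (w->robot == w->bucket) return true; // goal reached
            }
            break;
    }
    if (w->carrying) w->ball = w->robot; // carried ball moves with the robot
    return false;
}

int main(void) {
    World w = { .robot = 1, .ball = 7, .bucket = 3, .carrying = false };
    Action plan[] = { RIGHT, RIGHT, RIGHT, RIGHT, RIGHT, RIGHT, PICK,
                      LEFT, LEFT, LEFT, LEFT, DROP };
    for (size_t i = 0; i < sizeof plan / sizeof plan[0]; i++) {
        if (step(&w, plan[i])) { printf("goal reached\n"); return 0; }
    }
    printf("goal not reached\n");
    return 0;
}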
In this game, ONA is expected to learn the procedural knowledge by comparing the frequency of beliefs (corresponding to the relative positions of the robot and the ball/bucket), which is logically represented by the inference rules { <{S1} |-> [P]>, <{S2} |-> [P]> } |- <({S1} * {S2}) --> (+ P)> (t_frequency_greater) and { <{S1} |-> [P]>, <{S2} |-> [P]> } |- <({S1} * {S2}) --> (= P)> (t_frequency_equal).
Using this representation of relative position, ONA is able to learn "pick when the robot's position equals the ball's, drop when the robot's position equals the bucket's, and move left/right to make their positions equal", thereby providing evidence that ONA has an efficient procedural learning mechanism (sensorimotor intelligence).
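How such relative-position percepts could be derived from raw coordinates is sketched below. This is a hedged illustration only; the Narsese-like event strings and the "pos" property name are placeholders, not necessarily the input format the bandrobot demo actually uses.

// Sketch: turning raw x-coordinates into relative-position events.
// The event strings are illustrative placeholders, not ONA's actual input.
#include <stdio.h>

// Emit an equality event when both objects share a position, otherwise a
// "greater" event with the object at the larger position named first.
static void emit_relation(const char *a, int xa, const char *b, int xb) {
    if (xa == xb) {
        printf("<({%s} * {%s}) --> (= pos)>. :|:\n", a, b);
    } else if (xa > xb) {
        printf("<({%s} * {%s}) --> (+ pos)>. :|:\n", a, b);
    } else {
        printf("<({%s} * {%s}) --> (+ pos)>. :|:\n", b, a); // larger first
    }
}

int main(void) {
    int robot = 3, ball = 7, bucket = 3;
    emit_relation("robot", robot, "ball",   ball);    // robot left of ball
    emit_relation("robot", robot, "bucket", bucket);  // robot at the bucket
    return 0;
}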
Problem
As the title says, ONA currently achieves high performance on this game through self-learning.
However, if we change the random seed of the whole reasoner, it becomes apparent that this high performance may be accidental:
If the reasoner does not babble "precisely", the robot cannot learn any effective knowledge for achieving the goal.
Even when the robot achieves the goal by coincidence, if the second goal satisfaction arrives much later, the "right knowledge" represented by temporal implications will have faded out, and the reasoner falls back into the "random babbling without decisions" state, as if the accidental experience of success had never happened.
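To make the fading concrete, here is a toy numeric illustration. This is not ONA's actual forgetting/decay formula; the decay rate, boost, and threshold are assumed values chosen only to show the effect of a long gap between successes.

// Toy illustration: an implication's priority is boosted on each rare
// confirming success and decays every cycle in between. The decay rate,
// boost, and decision threshold are assumed values, not ONA parameters.
#include <stdio.h>

int main(void) {
    double priority = 0.0;
    const double decay = 0.995;   // per-cycle decay (assumed)
    const double boost = 0.5;     // boost on a confirming success (assumed)
    const double decision_threshold = 0.1;

    int gap_cycles = 2000;        // cycles between first and second success
    priority += boost;            // first accidental success
    for (int t = 0; t < gap_cycles; t++) {
        priority *= decay;
    }
    printf("priority after %d cycles: %f (%s threshold %.2f)\n",
           gap_cycles, priority,
           priority < decision_threshold ? "below" : "above",
           decision_threshold);
    // With these numbers the implication drops back below the threshold,
    // so the reasoner returns to undirected babbling, matching the
    // behaviour described above.
    return 0;
}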
Pictures
The successful case on mysrand(666)
Failing cases on mysrand(667) and mysrand(668)
ARCJ137442 changed the title from "The high performance in the bandrobot test maybe accidental" to "The high performance in the bandrobot test may be accidental" on Oct 10, 2024.
I agree, robust learning is not achieved for this particular example.
I also have a test script which runs it with different seeds to evaluate it; I can commit it soon.
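As a rough outline (not that script), a seed sweep could look like the sketch below, where run_bandrobot is a hypothetical wrapper around the demo that seeds it via mysrand and reports how often the ball was dropped into the bucket; a fake stub is used here so the sketch runs stand-alone.

// Sketch of a seed-sweep evaluation over the bandrobot demo.
#include <stdio.h>
#include <stdlib.h>

// Placeholder for the real demo run; the real version would call
// mysrand(seed), run the demo for a fixed number of reasoning cycles,
// and return the number of successful drops. Here it fakes a result.
static int run_bandrobot(unsigned int seed, int cycles) {
    srand(seed);
    (void)cycles;
    return rand() % 3; // fake success count
}

int main(void) {
    const int cycles = 10000;
    int seeds_tested = 0, seeds_succeeded = 0;
    for (unsigned int seed = 600; seed < 700; seed++) {
        int successes = run_bandrobot(seed, cycles);
        seeds_tested++;
        if (successes > 0) seeds_succeeded++;
        printf("seed %u: %d successes\n", seed, successes);
    }
    printf("%d / %d seeds produced at least one success\n",
           seeds_succeeded, seeds_tested);
    return 0;
}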
Part of the problem is that, by the design of this experiment, reward can only be obtained when the object at the right position is picked up and then dropped at the target location, which is a rare occurrence under motor babbling, and when it does happen there are tons of other hypotheses to weed out.
The solution will be to take what we learned from NACE and add the corresponding curiosity model to ONA: https://github.com/patham9/NACE
Another imminent change: the numeric representation is the initial, incomplete one that was added experimentally.
In the meantime there is a solid implementation of numeric spaces which allows the system both to condition on concrete values and to perform comparisons between numeric measurements.
With this new numeric value handling, learning also seems much more robust: http://91.203.212.130/AniNAL/demo_complex_continuous_verbal.html