Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sharing parse choice elements #1419

Open
linas opened this issue Feb 7, 2023 · 8 comments
Open

Sharing parse choice elements #1419

linas opened this issue Feb 7, 2023 · 8 comments

Comments

@linas
Copy link
Member

linas commented Feb 7, 2023

I have an idea of how to memory-spare parse choice elements, just like parse set elements are shared through the x-table.
Currently, they cannot be shared because each has a unique value.
I thought of these data structures (typedefs omitted):

struct Parse_set_struct
{
   Parse_choice_desc *desc;
   ...
};
struct Parse_choice_desc_struct
{
   Parse_choice_desc *next;
   Parse_choice_lr *l_pc, *r_pc;
};
struct Parse_choice_lr
{	
   Parse_set * set;
   Disjunct    *md;           /* the chosen disjunct for the middle word */
   int32_t id;                /* the tracon (l_id for l_pc and r_id for r_pc  */
};

(What is now set[1] and set[2] will be in l_pc and r_pc, correspondingly.)
Of course, there is a question of how much sharing will be done (if any, I don't know even that) and what will be the CPU overhead.

Originally posted by @ampli in #1402 (reply in thread)

@linas
Copy link
Member Author

linas commented Feb 7, 2023

I was going to write something clever here, but I now realize I don't understand the idea. It would seem like this results in two copies of Disjunct *md; instead of just one. There's also an extra int32_t id; that uses more space. I don't understand what Connector *le, *re; are. I don't understand which of these structs you think is sharable...

@ampli
Copy link
Member

ampli commented Feb 7, 2023

Sharing this one:

struct Parse_choice_lr
{	
   Parse_set * set;
   Disjunct    *md;           /* the chosen disjunct for the middle word */
   int32_t id;                /* the tracon (l_id for l_pc and r_id for r_pc  */
};

two copies of Disjunct *md; instead of just one

You are right. It should be moved to Parse_choice_desc_struct.

I don't understand what Connector *le, *re; are.

Where? They are not used here.

Also note that id replaces what is now Parse_choice::lc and Parse_choice::rc (depending if this is l_pc or r_pc).
A connector pointer can be used instead, but it seems that using tracon_id will lead to more sharing (to be checked).

I didn't explain the basis of this idea:
Currently Parse_choise has information of both sides at once., This make all of its elements unique so they cannot be shared. The idea is to separate it to LHS and RHS like set[0] and set[1].

@linas
Copy link
Member Author

linas commented Feb 7, 2023

id replaces what is now Parse_choice::lc

Ah! OK, that makes sense now.

The Connector *le, *re; are in the Parse_set. I'm guessing wildly that they are connectors on Parse_set_struct::lw and rw? If so, can they be removed and replaced by get_tracon(Parse_set_struct::l_id) ?

@ampli
Copy link
Member

ampli commented Feb 7, 2023

The Connector *le, *re; are in the Parse_set. I'm guessing wildly that they are connectors on Parse_set_struct::lw and rw? If so, can they be removed and replaced by get_tracon(Parse_set_struct::l_id) ?

Yes. But the "encoding for parsing" step would then need to produce a tracon table (to get the tracon address by its number). Such a table is produced by the "encoding for pruning" step, which shares most of its code, so it is easy to produce it. However, we will need to find how much overhead, if any, this adds.

@linas
Copy link
Member Author

linas commented Feb 7, 2023

would then need to ...

OK. I guess we can leave this alone for now. Things are complicated enough, as it is.

@ampli
Copy link
Member

ampli commented Feb 7, 2023

I guess we can leave this alone for now. Things are complicated enough, as it is.

But the opposite thing, i.e. remove int32_t l_id, r_id; seems easy and without overhead (unless I missed something) so I will try that.

@ampli
Copy link
Member

ampli commented Feb 12, 2023

The Connector *le, *re; are in the Parse_set. I'm guessing wildly that they are connectors on Parse_set_struct::lw and rw? If so, can they be removed and replaced by get_tracon(Parse_set_struct::l_id) ?

I found a way to replace Connector *lc, *rc; in Parse_choice with int32_t l_id, r_id; (saving 8 bytes) and get rid of int32_t l_id, r_id; in Parse_set w/o a noticeable overhead (PR soon).

BTW, by producing a tracon/disjunct mapping table (while encoding for parsing), it seems possible to also get rid of Disjunct *md; in Parse_choice. Even w/o additional table mapping, with some small overhead, it is possible to get rid of uint8_t lw, rw; in Parse_set, but then we still have uint8_t null_count that prevents the saving of 8 bytes. However, it is possible to get rid even of it if an x_table per null_count is implemented (which can be a good idea anyway).

@ampli
Copy link
Member

ampli commented Feb 19, 2023

it seems possible to also get rid of Disjunct *md

I Could not do it after all.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants