-
Notifications
You must be signed in to change notification settings - Fork 118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminate E_list #971
Eliminate E_list #971
Conversation
I forgot to add changes for the SAT parser. Note also that I didn't really convert commented-out functions (like the ones for infix notation), and also not the corpus stuff (they gotr partially converted by the initial elimination of the |
Still checking the SAT parser conversion - I will be able to finish only tomorrow. |
It turns out I first must fix issue #932, as the SAT parser somehow mishandles long expression chains. It looks like some subexpressions are skipped, but I'm not sure yet. I will have to inspect a detailed debug output from the last good commit and the first bad commit. This may take me several days more. Meanwhile, please disregard this PR. |
Maybe it is time to remove that.
It seems very unlikely that this will ever be revived, and should almost surely be removed, |
Prevent warnings in debug compilation.
1. Add field "operand_next" to replace the current E_list 'next". 2. Rename "l" to "operand_first" for readability. 3. Use unnamed union (seems to me more readable here).
Implement that by merging Exp and E_list.
Fixed in PR #972. The main work for removing E_list in the SAT-parser was fixing BTW, now the classic parser is very much faster than the SAT parser even for long sentences. I also have WIP changes to make the SAT parser much faster (also still need cleanup). But it seems that the classic parser will still be much faster after that. (I'm not sure about their relative speed after I implement my next ideas regarding yet speeding up both because the speedup potential in each is big.)
It can be merged now. |
I was keeping a diary of performance measurements and it was really quite fun to watch things get faster. Very sadly, I accidentally deleted it a few weeks ago, and don't have backups :-( |
After this merge, I get a build failure:
This is for |
I use one of the latest gcc versions.
I will submit a fix that works in version 6 too.
בתאריך יום ה׳, 18 ביולי 2019, 22:48, מאת Linas Vepštas <
[email protected]>:
… After this merge, I get a build failure:
../../../link-grammar/sat-solver/util.cpp: In function ‘Exp* null_exp()’:
../../../link-grammar/sat-solver/util.cpp:50:3: sorry, unimplemented: non-trivial designated initializers not supported
};
^
Makefile:478: recipe for target 'util.lo' failed
This is for gcc version 6.3.0 20170516 (Debian 6.3.0-18+deb9u1)
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#971?email_source=notifications&email_token=AAGWXGWFRSVIFQIAPVGH24LQADCJ3A5CNFSM4ICRWBA2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD2JSUJA#issuecomment-512961060>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAGWXGRPFDSQGRFH6MZGQOLQADCJ3ANCNFSM4ICRWBAQ>
.
|
If this is the case, there is a question: should the SAT parser be continued to be maintained, or should it be removed? It was originally created to overcome the perception of performance issues. But I've never actually used it in any large-scale projects, and I mostly don't much track how it's performance changes as the result of your work. The only reason I can think of to keep SAT around is that, for some unknown, futuristic variant of LG that allows link-crossing (non-planar graphs), maybe that would be easier to implement in SAT than the classic parser. Right now, its kind-of hard to imagine why this would be needed, at least for natural language. The only other strange variation I can think of is the case where every word can belong to 100 different word-classes, in which case there is an awful combinatoric explosion. But its unclear that SAT could offer any benefit for this, either ... |
Removal added to my TODO list. |
No huge performance change over time. Mainly bug fixes to make its output equivalent to the classic parser, and maintenance when it gets broken due to LG library changes. It needs to be made several times faster to be faster on short sentences. On long sentences, I think that even 10 times faster would not help. But both are definitely possible, and it may be technically interesting.
It is very easy to allows link-crossing - mainly commenting out some code.
The current SAT parser also has performance problem with a huge number of connectors. However, both parsers may be enhanced to be much more scalable.
I don't expect that any of my current WIPs will severely break the SAT parser like the Exp changes did |
Per discussion at PR opencog#971.
It seems it is more severely not up to date, since I get: Please advise. |
? It's installed in |
what needs to change is that we should no longer install |
Ah, I didn't install LG on purpose to make sure I will not get system headers by mistake in my development code... (but now that hopefully all the Since now the relevant headers files are in subdirectories under |
This can be done as follows, please tell me if this is fine:
|
I can take care of porting the opencog code to the new API; I don't expect it to be hard, just 6-12 lines of code total.
Sorry, I was confused; installing that file is just fine, I see no problem with it.
Sure, rename s needed for consistency. Yes, using a static with TLS sounds like a good idea. |
But should I rename the existing |
It is obsolete and unsupported. Per discussion at PR opencog#971.
I have no opinion. Most of the rest of the API is dictionary_blah and linkage_blah so expression_blah follows that pattern. |
It is obsolete and unsupported. Per discussion at PR opencog#971.
So I will leave the name |
These are found in void print_expression(const Exp *);
char *expression_stringify(const Exp *); But the comment at the start of this file say this is public API. In any case, I think that |
yes sounds good
I'm pretty sure no one is using it. |
I still not understand, as So these whole |
I guess that I will need to add the But I think that we should **also move them to |
In the past, Apple was the only OS that paid attention to what was exported, and so any ommissions in this file had to wait for Apple users to report them. Very recently, some version of Linux showed up that also pays attention to this file...
The only time this function was ever used by anybody was in opencog, during debugging. It is not used normally, and is not in the code base. I think it is very unlikely that there are any other users out there, or that anyone would ever use it except during debugging.
I'd rather not. The stuff in |
In fact, if you are bored and looking forr something new to do -- the language generation subsystem needs help, The idea is that you give it a bag of concepts, relations, actions, and these will get serialized into some grammatically valid sentence. (e.g. (Concept "Amir") (Concept "pizza") (Action "eat") (Time "past") and a sentence falls out "Amir ate pizza") The "bag of stuff" is informal and is not strictly defined; any sentence that approximates the bag of stuff is acceptable. There are some interesting things to think about with the "bag of stuff" needs to be, and how to convert that into sentences. The current code for this is called "sureal" (surface realization) and "microplannning" (to plan out multiple sentences) -- they don't work very well. |
I thought that the intention was to use the Is it still the case? |
I think the code is using the |
The "problem" is that I am not bored at all with the LG stuff. But I'm open to interleave it with other tasks, see below. Currently I try to convert to clean pull request mostly existing WIP branches, some of them very old. Now to the "see below" stuff. |
You are right - I located that. |
BTW, one of my TODOs is a simple way to eliminate the |
Well, that's good! You are good at what you do, and I was just thinking that there are these other systems that need this kind of love and attention.
I can try to review them faster, if it helps. Also, I am comfortable with 95% of the changes that you make, so you could start merging some of these yourself, to speed things up. There is very little in your pull reqs that I ever look at and find questionable or objectionable. |
Yes, it would very help, as the amount of PRs I would like to finally send is astonishing big, and I have a big list of TODOs (among other things, almost every open issue is at least one TODO).
The problem is that I'm never sure what it will be, and had a poor luck in guessing that... |
Getting it right 90% of the time is not "poor luck". |
It seems I introduced a bug in this PR into EDIT: Fixed in PR #988. |
The main change here is the elimination of the E_list struct. I think the code is much clearer without it.
See my post in issue 718.
Add expression traversal API calls per the proposal there. Since most of the
lg_exp_*()
API are inline functions, the Exp struct definition still needs to be public.(But maybe these calls can be made non-inline and moved to
link-includes.h
along withExp_type
and an opaque Exp definition.)In the same occasion, change
copy_Exp()
andexp_compare()
to iterate the expression operands instead of making recursive calls, to prevent extra stack usage on a very big operand list.This change increases the Exp struct back to 32 bytes (it got reduced to 24 bytes in the recent PR #969).
If needed, it can be reduced again to 24 bytes using this definition:
I will check the exact effect of the Exp struc sized when I implement my cost definition cleanup proposal in issue 783.
The total speedup is small - about 3% for the benchmarks of the
en
basic & fixes batches.N.B If desired, I can also send PR for the needed change in
LGDictExpContainer.cc
.